Watching Jamie build the model

library(tidyverse)
library(GGally)
library(modelr)
library(janitor)
avocados <- clean_names(read_csv("data/avocado.csv"))
head(avocados)

Prepare the data

Ok, we have 14 variables. Can already see that some of them are somewhat useless (x1 for example). Not sure whether the total_bags variable is the sum of small_bags, large_bags and x_large_bags so I’ll check that first.

# check to see if total_bags variable is just the sum of the other three
avocados %>%
  mutate(total_sum = small_bags + large_bags + x_large_bags) %>%
  select(total_bags, total_sum)

Yep, the total_bags column is just the sum of the other three. So this is another variable I can get rid of. I can also check the same for volume:

# check to see if total_volume variable is just the sum of the other three
avocados %>%
  mutate(total_sum = x4046 + x4225 + x4770) %>%
  select(total_volume, total_sum)

Nope, these aren’t the same, so we can keep all these in.
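
Both checks above compare the columns by eye on the printed rows. A stricter version would count the rows where the total and the sum disagree. The sketch below does that on a small made-up tibble (`toy_bags` and its values are invented for illustration, not taken from the avocado data), using `near()` from dplyr so that floating point rounding doesn’t produce false mismatches.

```r
library(dplyr)

# made-up data: the last row deliberately does NOT add up
toy_bags <- tibble(
  small_bags   = c(10, 20.5, 0),
  large_bags   = c(1, 2, 3),
  x_large_bags = c(0, 0.5, 1),
  total_bags   = c(11, 23, 5)
)

# count the rows where the stated total differs from the computed sum
check <- toy_bags %>%
  mutate(
    total_sum = small_bags + large_bags + x_large_bags,
    matches   = near(total_bags, total_sum)
  ) %>%
  summarise(n_rows = n(), n_mismatch = sum(!matches))

check
```

If `n_mismatch` comes back as zero on the real data, the total column is redundant and can safely go.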

Now let’s check how many different levels of each categorical variable we have.

avocados %>%
  distinct(region) %>%
  summarise(number_of_regions = n())
avocados %>%
  distinct(date) %>%
  summarise(
    number_of_dates = n(),
    min_date = min(date),
    max_date = max(date)
  )
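
As an aside, the number of levels in every character column can be counted in one go with `n_distinct()` inside `across()`. This is only a sketch on a made-up tibble (`toy` is invented for illustration); on the real data it would be run on `avocados` instead.

```r
library(dplyr)

# made-up data standing in for the categorical columns
toy <- tibble(
  region = c("Albany", "Albany", "Boston"),
  type   = c("organic", "conventional", "organic")
)

# one row, one column per character variable, holding its number of levels
level_counts <- toy %>%
  summarise(across(where(is.character), n_distinct))

level_counts
```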

The region variable will lead to many categorical levels, but we can try leaving it in. We should also examine date and perhaps pull out from it whatever features we can. Including every single date would be too much, so we can extract the different parts of the date that might be useful. For example, we could try and split it into different quarters, or years.

So, let’s do this now. Remove the variables we don’t need, change our categorical variables to factors, and extract parts of the date in case they are useful (and get rid of date).

library(lubridate)
trimmed_avocados <- avocados %>%
  mutate(
    quarter = as_factor(quarter(date)),
    year = as_factor(year),
    type = as_factor(type),
    region = as_factor(region)
  ) %>%
  select(-c(x1, date, total_bags))
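
To see what the date extraction gives us, here is a small sketch on four made-up dates (`toy_dates` is invented for illustration). It uses base `factor()` rather than `as_factor()` just to keep the example free of extra packages.

```r
library(lubridate)

# made-up dates, one per quarter
toy_dates <- as.Date(c("2015-01-04", "2016-05-15", "2017-08-20", "2018-11-25"))

# pull out the parts of the date we might want as predictors
date_features <- data.frame(
  date    = toy_dates,
  quarter = factor(quarter(toy_dates)),
  year    = factor(year(toy_dates))
)

date_features
```

Each date collapses to one of four quarter levels, which is far more manageable in a model than one level per date.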

Now we’ve done our cleaning, we can check for aliased variables (i.e. combinations of variables in which one or more of the variables can be calculated exactly from other variables):

alias(average_price ~ ., data = trimmed_avocados)

Nice, we don’t find any aliases. So we can keep going.
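
For comparison, this is roughly what it looks like when alias() does find something. The sketch uses a made-up data frame (`toy`, invented for illustration) in which `total` is exactly `a + b`, so one predictor can be computed from the others.

```r
# made-up data: total is an exact linear combination of a and b
toy <- data.frame(
  y = c(1.2, 2.3, 2.9, 4.1, 5.2, 5.8),
  a = c(1, 2, 3, 4, 5, 6),
  b = c(2, 1, 4, 3, 6, 5)
)
toy$total <- toy$a + toy$b

toy_alias <- alias(y ~ ., data = toy)

# the Complete element is only present when an exact dependency exists;
# it shows how the aliased term is built from the other terms
toy_alias$Complete
```

When there are no aliases, as in our avocado data, the `Complete` element is simply absent from the result.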

First Variable

We need to decide on which variable we want to put in our model first. To do this, we should visualise it. Because we have so much data, ggpairs() might take a while to run, so we can split it up a bit.

# let's start by plotting the volume variables
trimmed_avocados %>%
  select(average_price, total_volume, x4046, x4225, x4770) %>%
  ggpairs() + 
   theme_grey(base_size = 8) # font size of labels

Hmm, these look highly correlated with one another in some instances. This is a sign that we won’t have to include all of these in our model, so we could think about removing x4225 and x4770 from our dataset to give ourselves fewer variables.

trimmed_avocados <- trimmed_avocados %>%
  select(-x4225, -x4770)
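
If the ggpairs() plot is too slow or too crowded, the same correlations can be read off a plain correlation matrix. The sketch below uses the built-in `mtcars` data as a stand-in (it is not the avocado data), and the 0.8 cut-off is an arbitrary choice for illustration rather than a rule.

```r
# stand-in predictors from the built-in mtcars data
predictors <- mtcars[, c("disp", "hp", "wt")]

# pairwise Pearson correlations between the predictors
cor_mat <- cor(predictors)
round(cor_mat, 2)

# flag pairs whose absolute correlation is above an (arbitrary) 0.8 cut-off,
# looking only at the upper triangle so each pair appears once
high <- which(abs(cor_mat) > 0.8 & upper.tri(cor_mat), arr.ind = TRUE)
high_pairs <- data.frame(
  var1 = rownames(cor_mat)[high[, "row"]],
  var2 = colnames(cor_mat)[high[, "col"]]
)
high_pairs
```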

In terms of variables that correlate well with average_price… well, none of them do that well. But that’s life. Our x4046 variable is probably our first candidate.

Next we can look at our bag variables.

trimmed_avocados %>%
  select(average_price, small_bags, large_bags, x_large_bags) %>%
  ggpairs() + 
   theme_grey(base_size = 8) # font size of labels

Hmm, again… not that promising. Some of the variables are highly correlated with one another, but not much seems highly correlated with average_price.

We can look at some of our categorical variables next:

trimmed_avocados %>%
  select(average_price, type, year, quarter) %>%
  ggpairs() + 
   theme_grey(base_size = 8) # font size of labels

This seems better! Our type variable seems to show variation in the boxplots. This might suggest that conventional avocados and organic ones have different prices (which again, makes sense).

Finally, we can make a boxplot of our region variable. Because this has so many levels, it makes sense to plot it by itself so we can see it.

trimmed_avocados %>%
  ggplot(aes(x = region, y = average_price)) +
  geom_boxplot() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))

Ok, seems there is some variation in the boxplots between different regions, so that seems like it could be promising.

Let’s start by testing competing models. We decided that x4046, type, and region seemed reasonable:

library(ggfortify)

# build the model 
model1a <- lm(average_price ~ x4046, data = trimmed_avocados)

# check the diagnostics
autoplot(model1a)

# check the summary output
summary(model1a)

Call:
lm(formula = average_price ~ x4046, data = trimmed_avocados)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.98539 -0.29842 -0.03531  0.25459  1.82475 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.425e+00  2.993e-03  476.29   <2e-16 ***
x4046       -6.631e-08  2.305e-09  -28.77   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3939 on 18247 degrees of freedom
Multiple R-squared:  0.0434,    Adjusted R-squared:  0.04334 
F-statistic: 827.8 on 1 and 18247 DF,  p-value: < 2.2e-16

# build the model
model1b <- lm(average_price ~ type, data = trimmed_avocados)

# check the diagnostics
autoplot(model1b)

# check the summary output
summary(model1b)

Call:
lm(formula = average_price ~ type, data = trimmed_avocados)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.21400 -0.20400 -0.02804  0.18600  1.59600 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 1.158040   0.003321   348.7   <2e-16 ***
typeorganic 0.495959   0.004697   105.6   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3173 on 18247 degrees of freedom
Multiple R-squared:  0.3793,    Adjusted R-squared:  0.3792 
F-statistic: 1.115e+04 on 1 and 18247 DF,  p-value: < 2.2e-16

# build the model
model1c <- lm(average_price ~ region, data = trimmed_avocados)

# check the diagnostics
autoplot(model1c)

# check the summary output
summary(model1c)

Call:
lm(formula = average_price ~ region, data = trimmed_avocados)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.97095 -0.28423 -0.03432  0.25207  1.76115 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)                1.561036   0.020006  78.029  < 2e-16 ***
regionAtlanta             -0.223077   0.028293  -7.885 3.33e-15 ***
regionBaltimoreWashington -0.026805   0.028293  -0.947  0.34344    
regionBoise               -0.212899   0.028293  -7.525 5.52e-14 ***
regionBoston              -0.030148   0.028293  -1.066  0.28663    
regionBuffaloRochester    -0.044201   0.028293  -1.562  0.11824    
regionCalifornia          -0.165710   0.028293  -5.857 4.79e-09 ***
regionCharlotte            0.045000   0.028293   1.591  0.11173    
regionChicago             -0.004260   0.028293  -0.151  0.88031    
regionCincinnatiDayton    -0.351834   0.028293 -12.436  < 2e-16 ***
regionColumbus            -0.308254   0.028293 -10.895  < 2e-16 ***
regionDallasFtWorth       -0.475444   0.028293 -16.805  < 2e-16 ***
regionDenver              -0.342456   0.028293 -12.104  < 2e-16 ***
regionDetroit             -0.284941   0.028293 -10.071  < 2e-16 ***
regionGrandRapids         -0.056036   0.028293  -1.981  0.04765 *  
regionGreatLakes          -0.222485   0.028293  -7.864 3.94e-15 ***
regionHarrisburgScranton  -0.047751   0.028293  -1.688  0.09147 .  
regionHartfordSpringfield  0.257604   0.028293   9.105  < 2e-16 ***
regionHouston             -0.513107   0.028293 -18.136  < 2e-16 ***
regionIndianapolis        -0.247041   0.028293  -8.732  < 2e-16 ***
regionJacksonville        -0.050089   0.028293  -1.770  0.07668 .  
regionLasVegas            -0.180118   0.028293  -6.366 1.98e-10 ***
regionLosAngeles          -0.345030   0.028293 -12.195  < 2e-16 ***
regionLouisville          -0.274349   0.028293  -9.697  < 2e-16 ***
regionMiamiFtLauderdale   -0.132544   0.028293  -4.685 2.82e-06 ***
regionMidsouth            -0.156272   0.028293  -5.523 3.37e-08 ***
regionNashville           -0.348935   0.028293 -12.333  < 2e-16 ***
regionNewOrleansMobile    -0.256243   0.028293  -9.057  < 2e-16 ***
regionNewYork              0.166538   0.028293   5.886 4.02e-09 ***
regionNortheast            0.040888   0.028293   1.445  0.14843    
regionNorthernNewEngland  -0.083639   0.028293  -2.956  0.00312 ** 
regionOrlando             -0.054822   0.028293  -1.938  0.05268 .  
regionPhiladelphia         0.071095   0.028293   2.513  0.01199 *  
regionPhoenixTucson       -0.336598   0.028293 -11.897  < 2e-16 ***
regionPittsburgh          -0.196716   0.028293  -6.953 3.70e-12 ***
regionPlains              -0.124527   0.028293  -4.401 1.08e-05 ***
regionPortland            -0.243314   0.028293  -8.600  < 2e-16 ***
regionRaleighGreensboro   -0.005917   0.028293  -0.209  0.83434    
regionRichmondNorfolk     -0.269704   0.028293  -9.533  < 2e-16 ***
regionRoanoke             -0.313107   0.028293 -11.067  < 2e-16 ***
regionSacramento           0.060533   0.028293   2.140  0.03241 *  
regionSanDiego            -0.162870   0.028293  -5.757 8.72e-09 ***
regionSanFrancisco         0.243166   0.028293   8.595  < 2e-16 ***
regionSeattle             -0.118462   0.028293  -4.187 2.84e-05 ***
regionSouthCarolina       -0.157751   0.028293  -5.576 2.50e-08 ***
regionSouthCentral        -0.459793   0.028293 -16.251  < 2e-16 ***
regionSoutheast           -0.163018   0.028293  -5.762 8.45e-09 ***
regionSpokane             -0.115444   0.028293  -4.080 4.52e-05 ***
regionStLouis             -0.130414   0.028293  -4.609 4.06e-06 ***
regionSyracuse            -0.040710   0.028293  -1.439  0.15020    
regionTampa               -0.152189   0.028293  -5.379 7.58e-08 ***
regionTotalUS             -0.242012   0.028293  -8.554  < 2e-16 ***
regionWest                -0.288817   0.028293 -10.208  < 2e-16 ***
regionWestTexNewMexico    -0.299334   0.028356 -10.556  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3678 on 18195 degrees of freedom
Multiple R-squared:  0.1681,    Adjusted R-squared:  0.1657 
F-statistic: 69.38 on 53 and 18195 DF,  p-value: < 2.2e-16

model1b with type is best, so we’ll keep that and re-run ggpairs() with the residuals (again omitting region because it’s too big).
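
Scrolling through three summary() printouts to compare R-squared values gets tedious. One way to line the candidates up side by side is sketched below, using the built-in `mtcars` data as a stand-in for the avocado models (the model names and predictors here are for illustration only).

```r
# stand-in single-predictor models on the built-in mtcars data
candidates <- list(
  wt   = lm(mpg ~ wt, data = mtcars),
  hp   = lm(mpg ~ hp, data = mtcars),
  qsec = lm(mpg ~ qsec, data = mtcars)
)

# pull the adjusted R-squared out of each model summary
adj_r2 <- sapply(candidates, function(m) summary(m)$adj.r.squared)

# best candidate first
sort(adj_r2, decreasing = TRUE)
```

The same pattern would work with `list(model1a, model1b, model1c)`, although the diagnostic plots still need checking by eye.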

Second Variable

avocados_remaining_resid <- trimmed_avocados %>%
  add_residuals(model1b) %>%
  select(-c("average_price", "type", "region"))

ggpairs(avocados_remaining_resid) + 
  theme_grey(base_size = 8) # this bit just changes the axis label font size so we can see

Again, this isn’t showing any really high correlations between the residuals and any of our numeric variables. Looks like x4046, year, quarter could show something potentially (given the rubbish variables we have).
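
As a reminder of what we are plotting here: add_residuals() from modelr adds a `resid` column holding the observed response minus the model’s prediction. A quick sketch on the built-in `mtcars` data (a stand-in, not the avocado data) confirms this by computing the residuals by hand.

```r
library(modelr)

# stand-in model on the built-in mtcars data
fit <- lm(mpg ~ wt, data = mtcars)

# modelr version: appends a column called resid
with_resid <- add_residuals(mtcars, fit)

# manual version: observed minus predicted
manual_resid <- mtcars$mpg - predict(fit, mtcars)

# the two should agree up to floating point error
all.equal(as.numeric(with_resid$resid), as.numeric(manual_resid))
```

So any pattern left between `resid` and a remaining variable is variation that the current model has not yet explained.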

trimmed_avocados %>%
  add_residuals(model1b) %>%
  ggplot(aes(x = region, y = resid)) +
  geom_boxplot() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))

Looks like region is another contender, so together with x4046, year and quarter these are the next candidates to try. Let’s do these now.

model2a <- lm(average_price ~ type + x4046, data = trimmed_avocados)
autoplot(model2a)

summary(model2a)

Call:
lm(formula = average_price ~ type + x4046, data = trimmed_avocados)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.21416 -0.20029 -0.02736  0.18591  1.59589 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.171e+00  3.485e-03  336.13   <2e-16 ***
typeorganic  4.827e-01  4.802e-03  100.52   <2e-16 ***
x4046       -2.323e-08  1.898e-09  -12.24   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.316 on 18246 degrees of freedom
Multiple R-squared:  0.3843,    Adjusted R-squared:  0.3843 
F-statistic:  5695 on 2 and 18246 DF,  p-value: < 2.2e-16

model2b <- lm(average_price ~ type + year, data = trimmed_avocados)
autoplot(model2b)

summary(model2b)

Call:
lm(formula = average_price ~ type + year, data = trimmed_avocados)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.32320 -0.18722 -0.01722  0.18278  1.66337 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.127645   0.004704 239.735  < 2e-16 ***
typeorganic  0.495980   0.004563 108.685  < 2e-16 ***
year2016    -0.036995   0.005817  -6.360 2.07e-10 ***
year2017     0.139580   0.005790  24.107  < 2e-16 ***
year2018    -0.028104   0.009499  -2.959  0.00309 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3082 on 18244 degrees of freedom
Multiple R-squared:  0.4142,    Adjusted R-squared:  0.4141 
F-statistic:  3225 on 4 and 18244 DF,  p-value: < 2.2e-16

model2c <- lm(average_price ~ type + quarter, data = trimmed_avocados)
autoplot(model2c)

summary(model2c)

Call:
lm(formula = average_price ~ type + quarter, data = trimmed_avocados)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.11458 -0.20089 -0.02458  0.18542  1.54687 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 1.058626   0.004718  224.38   <2e-16 ***
typeorganic 0.495958   0.004543  109.16   <2e-16 ***
quarter2    0.068546   0.006282   10.91   <2e-16 ***
quarter3    0.206308   0.006281   32.84   <2e-16 ***
quarter4    0.152040   0.006237   24.38   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3069 on 18244 degrees of freedom
Multiple R-squared:  0.4193,    Adjusted R-squared:  0.4192 
F-statistic:  3294 on 4 and 18244 DF,  p-value: < 2.2e-16

model2d <- lm(average_price ~ type + region, data = trimmed_avocados)
autoplot(model2d)

summary(model2d)

Call:
lm(formula = average_price ~ type + region, data = trimmed_avocados)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.09858 -0.16716 -0.01814  0.14692  1.51320 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)                1.313079   0.014894  88.159  < 2e-16 ***
typeorganic                0.495912   0.004017 123.452  < 2e-16 ***
regionAtlanta             -0.223077   0.020871 -10.688  < 2e-16 ***
regionBaltimoreWashington -0.026805   0.020871  -1.284  0.19906    
regionBoise               -0.212899   0.020871 -10.201  < 2e-16 ***
regionBoston              -0.030148   0.020871  -1.444  0.14863    
regionBuffaloRochester    -0.044201   0.020871  -2.118  0.03421 *  
regionCalifornia          -0.165710   0.020871  -7.940 2.15e-15 ***
regionCharlotte            0.045000   0.020871   2.156  0.03109 *  
regionChicago             -0.004260   0.020871  -0.204  0.83826    
regionCincinnatiDayton    -0.351834   0.020871 -16.857  < 2e-16 ***
regionColumbus            -0.308254   0.020871 -14.769  < 2e-16 ***
regionDallasFtWorth       -0.475444   0.020871 -22.780  < 2e-16 ***
regionDenver              -0.342456   0.020871 -16.408  < 2e-16 ***
regionDetroit             -0.284941   0.020871 -13.652  < 2e-16 ***
regionGrandRapids         -0.056036   0.020871  -2.685  0.00726 ** 
regionGreatLakes          -0.222485   0.020871 -10.660  < 2e-16 ***
regionHarrisburgScranton  -0.047751   0.020871  -2.288  0.02216 *  
regionHartfordSpringfield  0.257604   0.020871  12.342  < 2e-16 ***
regionHouston             -0.513107   0.020871 -24.584  < 2e-16 ***
regionIndianapolis        -0.247041   0.020871 -11.836  < 2e-16 ***
regionJacksonville        -0.050089   0.020871  -2.400  0.01641 *  
regionLasVegas            -0.180118   0.020871  -8.630  < 2e-16 ***
regionLosAngeles          -0.345030   0.020871 -16.531  < 2e-16 ***
regionLouisville          -0.274349   0.020871 -13.145  < 2e-16 ***
regionMiamiFtLauderdale   -0.132544   0.020871  -6.351 2.20e-10 ***
regionMidsouth            -0.156272   0.020871  -7.487 7.35e-14 ***
regionNashville           -0.348935   0.020871 -16.718  < 2e-16 ***
regionNewOrleansMobile    -0.256243   0.020871 -12.277  < 2e-16 ***
regionNewYork              0.166538   0.020871   7.979 1.56e-15 ***
regionNortheast            0.040888   0.020871   1.959  0.05013 .  
regionNorthernNewEngland  -0.083639   0.020871  -4.007 6.16e-05 ***
regionOrlando             -0.054822   0.020871  -2.627  0.00863 ** 
regionPhiladelphia         0.071095   0.020871   3.406  0.00066 ***
regionPhoenixTucson       -0.336598   0.020871 -16.127  < 2e-16 ***
regionPittsburgh          -0.196716   0.020871  -9.425  < 2e-16 ***
regionPlains              -0.124527   0.020871  -5.966 2.47e-09 ***
regionPortland            -0.243314   0.020871 -11.658  < 2e-16 ***
regionRaleighGreensboro   -0.005917   0.020871  -0.284  0.77679    
regionRichmondNorfolk     -0.269704   0.020871 -12.922  < 2e-16 ***
regionRoanoke             -0.313107   0.020871 -15.002  < 2e-16 ***
regionSacramento           0.060533   0.020871   2.900  0.00373 ** 
regionSanDiego            -0.162870   0.020871  -7.803 6.35e-15 ***
regionSanFrancisco         0.243166   0.020871  11.651  < 2e-16 ***
regionSeattle             -0.118462   0.020871  -5.676 1.40e-08 ***
regionSouthCarolina       -0.157751   0.020871  -7.558 4.28e-14 ***
regionSouthCentral        -0.459793   0.020871 -22.030  < 2e-16 ***
regionSoutheast           -0.163018   0.020871  -7.811 6.00e-15 ***
regionSpokane             -0.115444   0.020871  -5.531 3.22e-08 ***
regionStLouis             -0.130414   0.020871  -6.248 4.24e-10 ***
regionSyracuse            -0.040710   0.020871  -1.951  0.05113 .  
regionTampa               -0.152189   0.020871  -7.292 3.18e-13 ***
regionTotalUS             -0.242012   0.020871 -11.595  < 2e-16 ***
regionWest                -0.288817   0.020871 -13.838  < 2e-16 ***
regionWestTexNewMexico    -0.297114   0.020918 -14.204  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2713 on 18194 degrees of freedom
Multiple R-squared:  0.5473,    Adjusted R-squared:  0.546 
F-statistic: 407.4 on 54 and 18194 DF,  p-value: < 2.2e-16

So model2d with type and region comes out best here. Some of the region coefficients are not significant at the 0.05 level, so let’s run an anova() to test whether region as a whole is worth including.

# model1b is the model with average_price ~ type
# model2d is the model with average_price ~ type + region

# we want to compare the two
anova(model1b, model2d)

Analysis of Variance Table

Model 1: average_price ~ type
Model 2: average_price ~ type + region
  Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
1  18247 1836.7                                  
2  18194 1339.4 53    497.26 127.44 < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

It seems region is significant overall, so we’ll keep it in!
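
To see where the F value in the table comes from, we can rebuild it from the printed residual sums of squares and degrees of freedom. The numbers below are copied from the anova() output above, so the result only matches to the precision of that printout. The variable names are made up for this sketch.

```r
# values copied from the anova() table above
rss_small <- 1836.7   # average_price ~ type
rss_big   <- 1339.4   # average_price ~ type + region
df_small  <- 18247    # residual df of the smaller model
df_big    <- 18194    # residual df of the bigger model

# number of extra parameters that region adds (53 dummy variables)
extra_df <- df_small - df_big

# F = (drop in RSS per extra parameter) / (residual variance of the bigger model)
f_stat <- ((rss_small - rss_big) / extra_df) / (rss_big / df_big)
f_stat

# upper-tail probability of seeing an F this large if region had no effect
p_value <- pf(f_stat, extra_df, df_big, lower.tail = FALSE)
p_value
```

This is a single test of all 53 region coefficients at once, which is why it is a better guide than the individual t-tests in the summary.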

Third Variable

model2d is our model with average_price ~ type + region, and it explains about 55% of the variance in average price (R-squared of 0.5473). This isn’t really very high, so we can think about adding a third predictor now. Again, we drop the variables already in the model from our data, and check the remaining ones against the residuals.

avocados_remaining_resid <- trimmed_avocados %>%
  add_residuals(model2d) %>%
  select(-c("average_price", "type", "region"))

ggpairs(avocados_remaining_resid) + 
   theme_grey(base_size = 8) # font size of labels

The next contender variables look to be x_large_bags, year and quarter. Let’s try them out.

model3a <- lm(average_price ~ type + region + x_large_bags, data = trimmed_avocados)
autoplot(model3a)

summary(model3a)

Call:
lm(formula = average_price ~ type + region + x_large_bags, data = trimmed_avocados)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.10024 -0.16726 -0.01734  0.14591  1.51156 

Coefficients:
                            Estimate Std. Error t value Pr(>|t|)    
(Intercept)                1.311e+00  1.489e-02  88.033  < 2e-16 ***
typeorganic                5.001e-01  4.101e-03 121.953  < 2e-16 ***
regionAtlanta             -2.235e-01  2.086e-02 -10.718  < 2e-16 ***
regionBaltimoreWashington -2.713e-02  2.086e-02  -1.301 0.193298    
regionBoise               -2.128e-01  2.086e-02 -10.204  < 2e-16 ***
regionBoston              -3.023e-02  2.086e-02  -1.449 0.147234    
regionBuffaloRochester    -4.428e-02  2.086e-02  -2.123 0.033774 *  
regionCalifornia          -1.762e-01  2.096e-02  -8.408  < 2e-16 ***
regionCharlotte            4.495e-02  2.086e-02   2.155 0.031177 *  
regionChicago             -4.936e-03  2.086e-02  -0.237 0.812924    
regionCincinnatiDayton    -3.523e-01  2.086e-02 -16.890  < 2e-16 ***
regionColumbus            -3.086e-01  2.086e-02 -14.796  < 2e-16 ***
regionDallasFtWorth       -4.762e-01  2.086e-02 -22.832  < 2e-16 ***
regionDenver              -3.425e-01  2.086e-02 -16.420  < 2e-16 ***
regionDetroit             -2.882e-01  2.087e-02 -13.810  < 2e-16 ***
regionGrandRapids         -5.764e-02  2.086e-02  -2.763 0.005731 ** 
regionGreatLakes          -2.353e-01  2.101e-02 -11.198  < 2e-16 ***
regionHarrisburgScranton  -4.798e-02  2.086e-02  -2.300 0.021451 *  
regionHartfordSpringfield  2.575e-01  2.086e-02  12.347  < 2e-16 ***
regionHouston             -5.137e-01  2.086e-02 -24.628  < 2e-16 ***
regionIndianapolis        -2.475e-01  2.086e-02 -11.867  < 2e-16 ***
regionJacksonville        -5.021e-02  2.086e-02  -2.407 0.016074 *  
regionLasVegas            -1.801e-01  2.086e-02  -8.633  < 2e-16 ***
regionLosAngeles          -3.532e-01  2.092e-02 -16.881  < 2e-16 ***
regionLouisville          -2.745e-01  2.086e-02 -13.160  < 2e-16 ***
regionMiamiFtLauderdale   -1.331e-01  2.086e-02  -6.380 1.81e-10 ***
regionMidsouth            -1.590e-01  2.086e-02  -7.619 2.68e-14 ***
regionNashville           -3.491e-01  2.086e-02 -16.736  < 2e-16 ***
regionNewOrleansMobile    -2.572e-01  2.086e-02 -12.330  < 2e-16 ***
regionNewYork              1.659e-01  2.086e-02   7.954 1.91e-15 ***
regionNortheast            3.834e-02  2.086e-02   1.838 0.066151 .  
regionNorthernNewEngland  -8.377e-02  2.086e-02  -4.017 5.93e-05 ***
regionOrlando             -5.523e-02  2.086e-02  -2.648 0.008111 ** 
regionPhiladelphia         7.097e-02  2.086e-02   3.403 0.000669 ***
regionPhoenixTucson       -3.368e-01  2.086e-02 -16.149  < 2e-16 ***
regionPittsburgh          -1.967e-01  2.086e-02  -9.433  < 2e-16 ***
regionPlains              -1.267e-01  2.086e-02  -6.072 1.29e-09 ***
regionPortland            -2.434e-01  2.086e-02 -11.669  < 2e-16 ***
regionRaleighGreensboro   -6.021e-03  2.086e-02  -0.289 0.772828    
regionRichmondNorfolk     -2.699e-01  2.086e-02 -12.939  < 2e-16 ***
regionRoanoke             -3.132e-01  2.086e-02 -15.015  < 2e-16 ***
regionSacramento           6.020e-02  2.086e-02   2.886 0.003904 ** 
regionSanDiego            -1.631e-01  2.086e-02  -7.819 5.64e-15 ***
regionSanFrancisco         2.428e-01  2.086e-02  11.642  < 2e-16 ***
regionSeattle             -1.185e-01  2.086e-02  -5.682 1.35e-08 ***
regionSouthCarolina       -1.581e-01  2.086e-02  -7.581 3.59e-14 ***
regionSouthCentral        -4.650e-01  2.088e-02 -22.268  < 2e-16 ***
regionSoutheast           -1.680e-01  2.088e-02  -8.046 9.10e-16 ***
regionSpokane             -1.154e-01  2.086e-02  -5.531 3.22e-08 ***
regionStLouis             -1.308e-01  2.086e-02  -6.270 3.69e-10 ***
regionSyracuse            -4.071e-02  2.086e-02  -1.952 0.050993 .  
regionTampa               -1.526e-01  2.086e-02  -7.315 2.68e-13 ***
regionTotalUS             -2.852e-01  2.255e-02 -12.648  < 2e-16 ***
regionWest                -2.904e-01  2.086e-02 -13.922  < 2e-16 ***
regionWestTexNewMexico    -2.976e-01  2.090e-02 -14.238  < 2e-16 ***
x_large_bags               6.810e-07  1.351e-07   5.040 4.70e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2711 on 18193 degrees of freedom
Multiple R-squared:  0.548, Adjusted R-squared:  0.5466 
F-statistic:   401 on 55 and 18193 DF,  p-value: < 2.2e-16

model3b <- lm(average_price ~ type + region + year, data = trimmed_avocados)
autoplot(model3b)

summary(model3b)

Call:
lm(formula = average_price ~ type + region + year, data = trimmed_avocados)

Residuals:
    Min      1Q  Median      3Q     Max 
-1.1532 -0.1497 -0.0060  0.1419  1.4849 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)                1.282672   0.014600  87.857  < 2e-16 ***
typeorganic                0.495933   0.003859 128.501  < 2e-16 ***
regionAtlanta             -0.223077   0.020052 -11.125  < 2e-16 ***
regionBaltimoreWashington -0.026805   0.020052  -1.337 0.181322    
regionBoise               -0.212899   0.020052 -10.617  < 2e-16 ***
regionBoston              -0.030148   0.020052  -1.503 0.132735    
regionBuffaloRochester    -0.044201   0.020052  -2.204 0.027515 *  
regionCalifornia          -0.165710   0.020052  -8.264  < 2e-16 ***
regionCharlotte            0.045000   0.020052   2.244 0.024835 *  
regionChicago             -0.004260   0.020052  -0.212 0.831748    
regionCincinnatiDayton    -0.351834   0.020052 -17.546  < 2e-16 ***
regionColumbus            -0.308254   0.020052 -15.373  < 2e-16 ***
regionDallasFtWorth       -0.475444   0.020052 -23.710  < 2e-16 ***
regionDenver              -0.342456   0.020052 -17.078  < 2e-16 ***
regionDetroit             -0.284941   0.020052 -14.210  < 2e-16 ***
regionGrandRapids         -0.056036   0.020052  -2.794 0.005204 ** 
regionGreatLakes          -0.222485   0.020052 -11.095  < 2e-16 ***
regionHarrisburgScranton  -0.047751   0.020052  -2.381 0.017259 *  
regionHartfordSpringfield  0.257604   0.020052  12.847  < 2e-16 ***
regionHouston             -0.513107   0.020052 -25.589  < 2e-16 ***
regionIndianapolis        -0.247041   0.020052 -12.320  < 2e-16 ***
regionJacksonville        -0.050089   0.020052  -2.498 0.012501 *  
regionLasVegas            -0.180118   0.020052  -8.982  < 2e-16 ***
regionLosAngeles          -0.345030   0.020052 -17.207  < 2e-16 ***
regionLouisville          -0.274349   0.020052 -13.682  < 2e-16 ***
regionMiamiFtLauderdale   -0.132544   0.020052  -6.610 3.95e-11 ***
regionMidsouth            -0.156272   0.020052  -7.793 6.88e-15 ***
regionNashville           -0.348935   0.020052 -17.401  < 2e-16 ***
regionNewOrleansMobile    -0.256243   0.020052 -12.779  < 2e-16 ***
regionNewYork              0.166538   0.020052   8.305  < 2e-16 ***
regionNortheast            0.040888   0.020052   2.039 0.041459 *  
regionNorthernNewEngland  -0.083639   0.020052  -4.171 3.05e-05 ***
regionOrlando             -0.054822   0.020052  -2.734 0.006263 ** 
regionPhiladelphia         0.071095   0.020052   3.545 0.000393 ***
regionPhoenixTucson       -0.336598   0.020052 -16.786  < 2e-16 ***
regionPittsburgh          -0.196716   0.020052  -9.810  < 2e-16 ***
regionPlains              -0.124527   0.020052  -6.210 5.41e-10 ***
regionPortland            -0.243314   0.020052 -12.134  < 2e-16 ***
regionRaleighGreensboro   -0.005917   0.020052  -0.295 0.767930    
regionRichmondNorfolk     -0.269704   0.020052 -13.450  < 2e-16 ***
regionRoanoke             -0.313107   0.020052 -15.615  < 2e-16 ***
regionSacramento           0.060533   0.020052   3.019 0.002542 ** 
regionSanDiego            -0.162870   0.020052  -8.122 4.86e-16 ***
regionSanFrancisco         0.243166   0.020052  12.127  < 2e-16 ***
regionSeattle             -0.118462   0.020052  -5.908 3.53e-09 ***
regionSouthCarolina       -0.157751   0.020052  -7.867 3.83e-15 ***
regionSouthCentral        -0.459793   0.020052 -22.930  < 2e-16 ***
regionSoutheast           -0.163018   0.020052  -8.130 4.58e-16 ***
regionSpokane             -0.115444   0.020052  -5.757 8.69e-09 ***
regionStLouis             -0.130414   0.020052  -6.504 8.04e-11 ***
regionSyracuse            -0.040710   0.020052  -2.030 0.042350 *  
regionTampa               -0.152189   0.020052  -7.590 3.36e-14 ***
regionTotalUS             -0.242012   0.020052 -12.069  < 2e-16 ***
regionWest                -0.288817   0.020052 -14.403  < 2e-16 ***
regionWestTexNewMexico    -0.296552   0.020097 -14.756  < 2e-16 ***
year2016                  -0.036970   0.004920  -7.515 5.96e-14 ***
year2017                   0.139555   0.004897  28.500  < 2e-16 ***
year2018                  -0.028078   0.008033  -3.495 0.000475 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2607 on 18191 degrees of freedom
Multiple R-squared:  0.5822,    Adjusted R-squared:  0.5809 
F-statistic: 444.8 on 57 and 18191 DF,  p-value: < 2.2e-16

model3c <- lm(average_price ~ type + region + quarter, data = trimmed_avocados)
autoplot(model3c)

summary(model3c)

Call:
lm(formula = average_price ~ type + region + quarter, data = trimmed_avocados)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.06767 -0.15971 -0.01185  0.14629  1.54411 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)                1.213689   0.014517  83.603  < 2e-16 ***
typeorganic                0.495911   0.003835 129.296  < 2e-16 ***
regionAtlanta             -0.223077   0.019928 -11.194  < 2e-16 ***
regionBaltimoreWashington -0.026805   0.019928  -1.345 0.178619    
regionBoise               -0.212899   0.019928 -10.683  < 2e-16 ***
regionBoston              -0.030148   0.019928  -1.513 0.130339    
regionBuffaloRochester    -0.044201   0.019928  -2.218 0.026565 *  
regionCalifornia          -0.165710   0.019928  -8.315  < 2e-16 ***
regionCharlotte            0.045000   0.019928   2.258 0.023950 *  
regionChicago             -0.004260   0.019928  -0.214 0.830716    
regionCincinnatiDayton    -0.351834   0.019928 -17.655  < 2e-16 ***
regionColumbus            -0.308254   0.019928 -15.468  < 2e-16 ***
regionDallasFtWorth       -0.475444   0.019928 -23.858  < 2e-16 ***
regionDenver              -0.342456   0.019928 -17.185  < 2e-16 ***
regionDetroit             -0.284941   0.019928 -14.298  < 2e-16 ***
regionGrandRapids         -0.056036   0.019928  -2.812 0.004931 ** 
regionGreatLakes          -0.222485   0.019928 -11.164  < 2e-16 ***
regionHarrisburgScranton  -0.047751   0.019928  -2.396 0.016577 *  
regionHartfordSpringfield  0.257604   0.019928  12.927  < 2e-16 ***
regionHouston             -0.513107   0.019928 -25.748  < 2e-16 ***
regionIndianapolis        -0.247041   0.019928 -12.397  < 2e-16 ***
regionJacksonville        -0.050089   0.019928  -2.513 0.011963 *  
regionLasVegas            -0.180118   0.019928  -9.038  < 2e-16 ***
regionLosAngeles          -0.345030   0.019928 -17.314  < 2e-16 ***
regionLouisville          -0.274349   0.019928 -13.767  < 2e-16 ***
regionMiamiFtLauderdale   -0.132544   0.019928  -6.651 2.99e-11 ***
regionMidsouth            -0.156272   0.019928  -7.842 4.69e-15 ***
regionNashville           -0.348935   0.019928 -17.510  < 2e-16 ***
regionNewOrleansMobile    -0.256243   0.019928 -12.858  < 2e-16 ***
regionNewYork              0.166538   0.019928   8.357  < 2e-16 ***
regionNortheast            0.040888   0.019928   2.052 0.040208 *  
regionNorthernNewEngland  -0.083639   0.019928  -4.197 2.72e-05 ***
regionOrlando             -0.054822   0.019928  -2.751 0.005947 ** 
regionPhiladelphia         0.071095   0.019928   3.568 0.000361 ***
regionPhoenixTucson       -0.336598   0.019928 -16.891  < 2e-16 ***
regionPittsburgh          -0.196716   0.019928  -9.871  < 2e-16 ***
regionPlains              -0.124527   0.019928  -6.249 4.23e-10 ***
regionPortland            -0.243314   0.019928 -12.210  < 2e-16 ***
regionRaleighGreensboro   -0.005917   0.019928  -0.297 0.766527    
regionRichmondNorfolk     -0.269704   0.019928 -13.534  < 2e-16 ***
regionRoanoke             -0.313107   0.019928 -15.712  < 2e-16 ***
regionSacramento           0.060533   0.019928   3.038 0.002389 ** 
regionSanDiego            -0.162870   0.019928  -8.173 3.21e-16 ***
regionSanFrancisco         0.243166   0.019928  12.202  < 2e-16 ***
regionSeattle             -0.118462   0.019928  -5.944 2.82e-09 ***
regionSouthCarolina       -0.157751   0.019928  -7.916 2.59e-15 ***
regionSouthCentral        -0.459793   0.019928 -23.073  < 2e-16 ***
regionSoutheast           -0.163018   0.019928  -8.180 3.02e-16 ***
regionSpokane             -0.115444   0.019928  -5.793 7.03e-09 ***
regionStLouis             -0.130414   0.019928  -6.544 6.14e-11 ***
regionSyracuse            -0.040710   0.019928  -2.043 0.041082 *  
regionTampa               -0.152189   0.019928  -7.637 2.33e-14 ***
regionTotalUS             -0.242012   0.019928 -12.144  < 2e-16 ***
regionWest                -0.288817   0.019928 -14.493  < 2e-16 ***
regionWestTexNewMexico    -0.297141   0.019973 -14.877  < 2e-16 ***
quarter2                   0.068479   0.005303  12.912  < 2e-16 ***
quarter3                   0.206308   0.005303  38.906  < 2e-16 ***
quarter4                   0.152007   0.005265  28.869  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2591 on 18191 degrees of freedom
Multiple R-squared:  0.5874,    Adjusted R-squared:  0.5861 
F-statistic: 454.3 on 57 and 18191 DF,  p-value: < 2.2e-16

So model3c, with type, region and quarter, wins out here. The diagnostics still look reasonable, with perhaps some mild heteroscedasticity.
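
To make the comparison easier to see, the key numbers from the three summaries can be lined up side by side. This sketch simply copies the printed figures into a data frame (the data frame and its column names are my own); in a live session the same values could be pulled from each model with summary(model)$adj.r.squared and similar.

```r
# figures copied from the three summary() outputs above
candidates <- data.frame(
  model         = c("model3a", "model3b", "model3c"),
  added         = c("x_large_bags", "year", "quarter"),
  r_squared     = c(0.5480, 0.5822, 0.5874),
  adj_r_squared = c(0.5466, 0.5809, 0.5861),
  resid_se      = c(0.2711, 0.2607, 0.2591)
)

# highest adjusted R-squared first
candidates[order(-candidates$adj_r_squared), ]

# name of the winning model
best <- candidates$model[which.max(candidates$adj_r_squared)]
best
```

Sorting on adjusted R-squared rather than plain R-squared is a small safeguard here, since year and quarter each add three coefficients while x_large_bags adds only one.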

Fourth Variable

Remember that with two predictors, our R^2 value was 0.5473. Now, with three predictors, we are at 0.5874. Ok, that seems like a reasonable improvement. So let’s see how much improvement we get by adding a fourth variable. Again, check the residuals to see which variables we should try adding.
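
Since every extra predictor pushes R^2 up at least a little, it is worth checking that the gain survives the adjustment for model size. The sketch below recomputes adjusted R^2 from the printed R^2 values using the standard formula; the helper adj_r2() is my own, and n is worked out from the degrees of freedom shown in the summaries.

```r
# adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)
# r2: multiple R-squared, n: number of rows, p: coefficients excluding the intercept
adj_r2 <- function(r2, n, p) {
  1 - (1 - r2) * (n - 1) / (n - p - 1)
}

# n = residual df + coefficients + intercept, from the model2d summary
n <- 18194 + 54 + 1

adj_r2(0.5473, n, p = 54)   # model2d: type + region
adj_r2(0.5874, n, p = 57)   # model3c: type + region + quarter
```

Both values should match the adjusted R-squared figures printed in the summaries (0.546 and 0.5861) up to rounding. With around 18,000 rows the penalty for three extra coefficients is tiny, so the gain of about 0.04 is essentially all real.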

avocados_remaining_resid <- trimmed_avocados %>%
  add_residuals(model3c) %>%
  select(-c("average_price", "type", "region", "quarter"))

ggpairs(avocados_remaining_resid) + 
   theme_grey(base_size = 8) # font size of labels

The contender variables here are x_large_bags and year, so let’s try them out.

model4a <- lm(average_price ~ type + region + quarter + x_large_bags, data = trimmed_avocados)
autoplot(model4a)

summary(model4a)

Call:
lm(formula = average_price ~ type + region + quarter + x_large_bags, 
    data = trimmed_avocados)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.06889 -0.16013 -0.01154  0.14553  1.54291 

Coefficients:
                            Estimate Std. Error t value Pr(>|t|)    
(Intercept)                1.212e+00  1.451e-02  83.493  < 2e-16 ***
typeorganic                4.998e-01  3.916e-03 127.614  < 2e-16 ***
regionAtlanta             -2.235e-01  1.992e-02 -11.222  < 2e-16 ***
regionBaltimoreWashington -2.711e-02  1.992e-02  -1.361 0.173535    
regionBoise               -2.128e-01  1.992e-02 -10.687  < 2e-16 ***
regionBoston              -3.022e-02  1.992e-02  -1.518 0.129137    
regionBuffaloRochester    -4.427e-02  1.992e-02  -2.223 0.026233 *  
regionCalifornia          -1.753e-01  2.002e-02  -8.759  < 2e-16 ***
regionCharlotte            4.495e-02  1.992e-02   2.257 0.024015 *  
regionChicago             -4.877e-03  1.992e-02  -0.245 0.806549    
regionCincinnatiDayton    -3.522e-01  1.992e-02 -17.686  < 2e-16 ***
regionColumbus            -3.086e-01  1.992e-02 -15.494  < 2e-16 ***
regionDallasFtWorth       -4.762e-01  1.992e-02 -23.908  < 2e-16 ***
regionDenver              -3.425e-01  1.992e-02 -17.196  < 2e-16 ***
regionDetroit             -2.879e-01  1.993e-02 -14.449  < 2e-16 ***
regionGrandRapids         -5.750e-02  1.992e-02  -2.887 0.003898 ** 
regionGreatLakes          -2.342e-01  2.006e-02 -11.671  < 2e-16 ***
regionHarrisburgScranton  -4.796e-02  1.992e-02  -2.408 0.016054 *  
regionHartfordSpringfield  2.575e-01  1.992e-02  12.931  < 2e-16 ***
regionHouston             -5.136e-01  1.992e-02 -25.789  < 2e-16 ***
regionIndianapolis        -2.475e-01  1.992e-02 -12.426  < 2e-16 ***
regionJacksonville        -5.020e-02  1.992e-02  -2.521 0.011720 *  
regionLasVegas            -1.801e-01  1.992e-02  -9.041  < 2e-16 ***
regionLosAngeles          -3.524e-01  1.998e-02 -17.644  < 2e-16 ***
regionLouisville          -2.745e-01  1.992e-02 -13.781  < 2e-16 ***
regionMiamiFtLauderdale   -1.330e-01  1.992e-02  -6.679 2.47e-11 ***
regionMidsouth            -1.587e-01  1.992e-02  -7.967 1.72e-15 ***
regionNashville           -3.491e-01  1.992e-02 -17.527  < 2e-16 ***
regionNewOrleansMobile    -2.571e-01  1.992e-02 -12.909  < 2e-16 ***
regionNewYork              1.660e-01  1.992e-02   8.333  < 2e-16 ***
regionNortheast            3.856e-02  1.992e-02   1.936 0.052939 .  
regionNorthernNewEngland  -8.376e-02  1.992e-02  -4.206 2.61e-05 ***
regionOrlando             -5.519e-02  1.992e-02  -2.771 0.005592 ** 
regionPhiladelphia         7.098e-02  1.992e-02   3.564 0.000366 ***
regionPhoenixTucson       -3.368e-01  1.992e-02 -16.911  < 2e-16 ***
regionPittsburgh          -1.967e-01  1.992e-02  -9.879  < 2e-16 ***
regionPlains              -1.265e-01  1.992e-02  -6.350 2.20e-10 ***
regionPortland            -2.434e-01  1.992e-02 -12.220  < 2e-16 ***
regionRaleighGreensboro   -6.012e-03  1.992e-02  -0.302 0.762753    
regionRichmondNorfolk     -2.699e-01  1.992e-02 -13.549  < 2e-16 ***
regionRoanoke             -3.132e-01  1.992e-02 -15.725  < 2e-16 ***
regionSacramento           6.023e-02  1.992e-02   3.024 0.002497 ** 
regionSanDiego            -1.631e-01  1.992e-02  -8.187 2.85e-16 ***
regionSanFrancisco         2.429e-01  1.992e-02  12.194  < 2e-16 ***
regionSeattle             -1.185e-01  1.992e-02  -5.950 2.72e-09 ***
regionSouthCarolina       -1.581e-01  1.992e-02  -7.938 2.18e-15 ***
regionSouthCentral        -4.646e-01  1.994e-02 -23.297  < 2e-16 ***
regionSoutheast           -1.676e-01  1.994e-02  -8.404  < 2e-16 ***
regionSpokane             -1.154e-01  1.992e-02  -5.793 7.02e-09 ***
regionStLouis             -1.307e-01  1.992e-02  -6.565 5.35e-11 ***
regionSyracuse            -4.071e-02  1.992e-02  -2.044 0.040974 *  
regionTampa               -1.525e-01  1.992e-02  -7.659 1.96e-14 ***
regionTotalUS             -2.814e-01  2.153e-02 -13.068  < 2e-16 ***
regionWest                -2.903e-01  1.992e-02 -14.573  < 2e-16 ***
regionWestTexNewMexico    -2.976e-01  1.996e-02 -14.910  < 2e-16 ***
quarter2                   6.806e-02  5.301e-03  12.839  < 2e-16 ***
quarter3                   2.055e-01  5.302e-03  38.761  < 2e-16 ***
quarter4                   1.527e-01  5.264e-03  29.001  < 2e-16 ***
x_large_bags               6.215e-07  1.292e-07   4.810 1.52e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2589 on 18190 degrees of freedom
Multiple R-squared:  0.5879,    Adjusted R-squared:  0.5866 
F-statistic: 447.4 on 58 and 18190 DF,  p-value: < 2.2e-16

model4b <- lm(average_price ~ type + region + quarter + year, data = trimmed_avocados)
autoplot(model4b)

summary(model4b)

Call:
lm(formula = average_price ~ type + region + quarter + year, 
    data = trimmed_avocados)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.03683 -0.14588 -0.00412  0.14386  1.43930 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)                1.167184   0.014290  81.677  < 2e-16 ***
typeorganic                0.495930   0.003675 134.950  < 2e-16 ***
regionAtlanta             -0.223077   0.019094 -11.683  < 2e-16 ***
regionBaltimoreWashington -0.026805   0.019094  -1.404 0.160383    
regionBoise               -0.212899   0.019094 -11.150  < 2e-16 ***
regionBoston              -0.030148   0.019094  -1.579 0.114368    
regionBuffaloRochester    -0.044201   0.019094  -2.315 0.020627 *  
regionCalifornia          -0.165710   0.019094  -8.679  < 2e-16 ***
regionCharlotte            0.045000   0.019094   2.357 0.018445 *  
regionChicago             -0.004260   0.019094  -0.223 0.823439    
regionCincinnatiDayton    -0.351834   0.019094 -18.427  < 2e-16 ***
regionColumbus            -0.308254   0.019094 -16.144  < 2e-16 ***
regionDallasFtWorth       -0.475444   0.019094 -24.900  < 2e-16 ***
regionDenver              -0.342456   0.019094 -17.935  < 2e-16 ***
regionDetroit             -0.284941   0.019094 -14.923  < 2e-16 ***
regionGrandRapids         -0.056036   0.019094  -2.935 0.003342 ** 
regionGreatLakes          -0.222485   0.019094 -11.652  < 2e-16 ***
regionHarrisburgScranton  -0.047751   0.019094  -2.501 0.012397 *  
regionHartfordSpringfield  0.257604   0.019094  13.491  < 2e-16 ***
regionHouston             -0.513107   0.019094 -26.873  < 2e-16 ***
regionIndianapolis        -0.247041   0.019094 -12.938  < 2e-16 ***
regionJacksonville        -0.050089   0.019094  -2.623 0.008716 ** 
regionLasVegas            -0.180118   0.019094  -9.433  < 2e-16 ***
regionLosAngeles          -0.345030   0.019094 -18.070  < 2e-16 ***
regionLouisville          -0.274349   0.019094 -14.368  < 2e-16 ***
regionMiamiFtLauderdale   -0.132544   0.019094  -6.942 4.00e-12 ***
regionMidsouth            -0.156272   0.019094  -8.184 2.91e-16 ***
regionNashville           -0.348935   0.019094 -18.275  < 2e-16 ***
regionNewOrleansMobile    -0.256243   0.019094 -13.420  < 2e-16 ***
regionNewYork              0.166538   0.019094   8.722  < 2e-16 ***
regionNortheast            0.040888   0.019094   2.141 0.032255 *  
regionNorthernNewEngland  -0.083639   0.019094  -4.380 1.19e-05 ***
regionOrlando             -0.054822   0.019094  -2.871 0.004094 ** 
regionPhiladelphia         0.071095   0.019094   3.723 0.000197 ***
regionPhoenixTucson       -0.336598   0.019094 -17.629  < 2e-16 ***
regionPittsburgh          -0.196716   0.019094 -10.303  < 2e-16 ***
regionPlains              -0.124527   0.019094  -6.522 7.13e-11 ***
regionPortland            -0.243314   0.019094 -12.743  < 2e-16 ***
regionRaleighGreensboro   -0.005917   0.019094  -0.310 0.756641    
regionRichmondNorfolk     -0.269704   0.019094 -14.125  < 2e-16 ***
regionRoanoke             -0.313107   0.019094 -16.398  < 2e-16 ***
regionSacramento           0.060533   0.019094   3.170 0.001526 ** 
regionSanDiego            -0.162870   0.019094  -8.530  < 2e-16 ***
regionSanFrancisco         0.243166   0.019094  12.735  < 2e-16 ***
regionSeattle             -0.118462   0.019094  -6.204 5.62e-10 ***
regionSouthCarolina       -0.157751   0.019094  -8.262  < 2e-16 ***
regionSouthCentral        -0.459793   0.019094 -24.081  < 2e-16 ***
regionSoutheast           -0.163018   0.019094  -8.538  < 2e-16 ***
regionSpokane             -0.115444   0.019094  -6.046 1.51e-09 ***
regionStLouis             -0.130414   0.019094  -6.830 8.75e-12 ***
regionSyracuse            -0.040710   0.019094  -2.132 0.033011 *  
regionTampa               -0.152189   0.019094  -7.971 1.67e-15 ***
regionTotalUS             -0.242012   0.019094 -12.675  < 2e-16 ***
regionWest                -0.288817   0.019094 -15.126  < 2e-16 ***
regionWestTexNewMexico    -0.296624   0.019137 -15.500  < 2e-16 ***
quarter2                   0.081121   0.005410  14.996  < 2e-16 ***
quarter3                   0.218901   0.005409  40.471  < 2e-16 ***
quarter4                   0.161972   0.005376  30.130  < 2e-16 ***
year2016                  -0.036978   0.004684  -7.894 3.10e-15 ***
year2017                   0.138658   0.004663  29.735  < 2e-16 ***
year2018                   0.087412   0.008334  10.488  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2482 on 18188 degrees of freedom
Multiple R-squared:  0.6213,    Adjusted R-squared:   0.62 
F-statistic: 497.3 on 60 and 18188 DF,  p-value: < 2.2e-16

Hmm, model4b with type, region, quarter and year wins here: its R^2 of 0.6213 beats the 0.5879 we get with x_large_bags as the fourth predictor instead of year. It has also improved our R^2 from 0.5874 (with three predictors) to 0.6213. That’s quite good.

Fifth Variable

We are likely now pursuing variables with rather limited explanatory power, but let’s check for one more main effect, and see how much predictive power it gives us.

avocados_remaining_resid <- trimmed_avocados %>%
  add_residuals(model4b) %>%
  select(-c("average_price", "type", "region", "quarter", "year"))

ggpairs(avocados_remaining_resid) + 
   theme_grey(base_size = 8) # font size of labels


It looks like x_large_bags is the remaining contender. Let’s check it out!
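
Reading a ggpairs grid by eye can be fiddly, so here is a rough numeric cross-check: correlate the residuals of the current model with each remaining numeric column and sort by absolute correlation. This is only a sketch. The helper name residual_correlations is made up for illustration (it is not part of modelr or GGally), and it assumes the data has no missing values, so the residuals line up row for row with the data.

```r
# Hypothetical helper (not from modelr/GGally): correlate a fitted model's
# residuals with candidate numeric columns that are not yet in the model.
# Assumes `data` is the data the model was fitted on, with no rows dropped.
residual_correlations <- function(model, data, candidates) {
  res <- resid(model)
  cors <- vapply(
    candidates,
    function(col) cor(res, data[[col]], use = "complete.obs"),
    numeric(1)
  )
  out <- data.frame(predictor = candidates, correlation = unname(cors))
  # strongest absolute correlation first
  out[order(-abs(out$correlation)), , drop = FALSE]
}

# Usage in this walkthrough might look like the line below, where the
# candidate vector holds whichever numeric columns are still unused:
# residual_correlations(model4b, trimmed_avocados, c("x_large_bags"))
```

A correlation only picks up linear association, so it complements the plots rather than replacing them.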

model5 <- lm(average_price ~ type + region + quarter + year + x_large_bags, data = trimmed_avocados)
autoplot(model5)

summary(model5)

Call:
lm(formula = average_price ~ type + region + quarter + year + 
    x_large_bags, data = trimmed_avocados)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.03610 -0.14545 -0.00439  0.14420  1.43907 

Coefficients:
                            Estimate Std. Error t value Pr(>|t|)    
(Intercept)                1.167e+00  1.429e-02  81.687  < 2e-16 ***
typeorganic                4.982e-01  3.755e-03 132.674  < 2e-16 ***
regionAtlanta             -2.233e-01  1.909e-02 -11.698  < 2e-16 ***
regionBaltimoreWashington -2.698e-02  1.909e-02  -1.413 0.157614    
regionBoise               -2.129e-01  1.909e-02 -11.151  < 2e-16 ***
regionBoston              -3.019e-02  1.909e-02  -1.582 0.113769    
regionBuffaloRochester    -4.424e-02  1.909e-02  -2.318 0.020485 *  
regionCalifornia          -1.713e-01  1.919e-02  -8.925  < 2e-16 ***
regionCharlotte            4.497e-02  1.909e-02   2.356 0.018493 *  
regionChicago             -4.616e-03  1.909e-02  -0.242 0.808941    
regionCincinnatiDayton    -3.521e-01  1.909e-02 -18.442  < 2e-16 ***
regionColumbus            -3.084e-01  1.909e-02 -16.157  < 2e-16 ***
regionDallasFtWorth       -4.759e-01  1.909e-02 -24.926  < 2e-16 ***
regionDenver              -3.425e-01  1.909e-02 -17.940  < 2e-16 ***
regionDetroit             -2.866e-01  1.910e-02 -15.008  < 2e-16 ***
regionGrandRapids         -5.688e-02  1.909e-02  -2.979 0.002894 ** 
regionGreatLakes          -2.292e-01  1.923e-02 -11.918  < 2e-16 ***
regionHarrisburgScranton  -4.787e-02  1.909e-02  -2.508 0.012166 *  
regionHartfordSpringfield  2.576e-01  1.909e-02  13.492  < 2e-16 ***
regionHouston             -5.134e-01  1.909e-02 -26.894  < 2e-16 ***
regionIndianapolis        -2.473e-01  1.909e-02 -12.954  < 2e-16 ***
regionJacksonville        -5.015e-02  1.909e-02  -2.627 0.008615 ** 
regionLasVegas            -1.801e-01  1.909e-02  -9.434  < 2e-16 ***
regionLosAngeles          -3.493e-01  1.915e-02 -18.243  < 2e-16 ***
regionLouisville          -2.744e-01  1.909e-02 -14.375  < 2e-16 ***
regionMiamiFtLauderdale   -1.328e-01  1.909e-02  -6.958 3.58e-12 ***
regionMidsouth            -1.577e-01  1.910e-02  -8.257  < 2e-16 ***
regionNashville           -3.490e-01  1.909e-02 -18.282  < 2e-16 ***
regionNewOrleansMobile    -2.567e-01  1.909e-02 -13.448  < 2e-16 ***
regionNewYork              1.662e-01  1.909e-02   8.706  < 2e-16 ***
regionNortheast            3.955e-02  1.910e-02   2.071 0.038381 *  
regionNorthernNewEngland  -8.371e-02  1.909e-02  -4.385 1.17e-05 ***
regionOrlando             -5.503e-02  1.909e-02  -2.883 0.003945 ** 
regionPhiladelphia         7.103e-02  1.909e-02   3.721 0.000199 ***
regionPhoenixTucson       -3.367e-01  1.909e-02 -17.638  < 2e-16 ***
regionPittsburgh          -1.967e-01  1.909e-02 -10.305  < 2e-16 ***
regionPlains              -1.257e-01  1.909e-02  -6.581 4.80e-11 ***
regionPortland            -2.434e-01  1.909e-02 -12.748  < 2e-16 ***
regionRaleighGreensboro   -5.972e-03  1.909e-02  -0.313 0.754415    
regionRichmondNorfolk     -2.698e-01  1.909e-02 -14.132  < 2e-16 ***
regionRoanoke             -3.131e-01  1.909e-02 -16.404  < 2e-16 ***
regionSacramento           6.036e-02  1.909e-02   3.162 0.001571 ** 
regionSanDiego            -1.630e-01  1.909e-02  -8.537  < 2e-16 ***
regionSanFrancisco         2.430e-01  1.909e-02  12.728  < 2e-16 ***
regionSeattle             -1.185e-01  1.909e-02  -6.207 5.52e-10 ***
regionSouthCarolina       -1.579e-01  1.909e-02  -8.274  < 2e-16 ***
regionSouthCentral        -4.625e-01  1.911e-02 -24.199  < 2e-16 ***
regionSoutheast           -1.656e-01  1.911e-02  -8.667  < 2e-16 ***
regionSpokane             -1.154e-01  1.909e-02  -6.045 1.52e-09 ***
regionStLouis             -1.306e-01  1.909e-02  -6.842 8.08e-12 ***
regionSyracuse            -4.071e-02  1.909e-02  -2.132 0.032984 *  
regionTampa               -1.524e-01  1.909e-02  -7.983 1.52e-15 ***
regionTotalUS             -2.647e-01  2.066e-02 -12.815  < 2e-16 ***
regionWest                -2.897e-01  1.909e-02 -15.171  < 2e-16 ***
regionWestTexNewMexico    -2.969e-01  1.913e-02 -15.518  < 2e-16 ***
quarter2                   8.058e-02  5.412e-03  14.891  < 2e-16 ***
quarter3                   2.181e-01  5.414e-03  40.293  < 2e-16 ***
quarter4                   1.621e-01  5.375e-03  30.154  < 2e-16 ***
year2016                  -3.791e-02  4.695e-03  -8.075 7.16e-16 ***
year2017                   1.375e-01  4.680e-03  29.381  < 2e-16 ***
year2018                   8.547e-02  8.360e-03  10.223  < 2e-16 ***
x_large_bags               3.583e-07  1.246e-07   2.877 0.004025 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2482 on 18187 degrees of freedom
Multiple R-squared:  0.6214,    Adjusted R-squared:  0.6202 
F-statistic: 489.4 on 61 and 18187 DF,  p-value: < 2.2e-16

Overall, we still have some heteroscedasticity and deviations from normality in the residuals. In terms of our regression summary, x_large_bags is a statistically significant explanatory variable (p = 0.004). But hmmm… with four predictors, our overall R^2 was 0.6213, and now with five we’ve only reached 0.6214. Given that there is no real increase in explanatory performance, even though it’s significant, we might want to remove it. Let’s do that and carry on with the four-predictor model4b.
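
To make this kind of comparison less error-prone than scrolling through two long summaries, the headline numbers can be pulled into one small table. The sketch below uses only base R; compare_fits is a hypothetical helper name, and AIC is included purely as an extra yardstick that the walkthrough itself does not use.

```r
# Hypothetical helper: collect R^2, adjusted R^2 and AIC for several lm fits.
compare_fits <- function(...) {
  models <- list(...)
  labels <- names(models)
  if (is.null(labels)) labels <- paste0("model_", seq_along(models))
  data.frame(
    model = labels,
    r_squared = unname(vapply(models, function(m) summary(m)$r.squared, numeric(1))),
    adj_r_squared = unname(vapply(models, function(m) summary(m)$adj.r.squared, numeric(1))),
    aic = unname(vapply(models, AIC, numeric(1)))
  )
}

# Usage in this walkthrough might look like:
# compare_fits(model4b = model4b, model5 = model5)
```

With a dataset this large, a tiny effect can easily be statistically significant, which is why the size of the R^2 change is worth looking at alongside the p-value.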

It’s also clear we aren’t gaining anything by adding predictors. The final thing we can do is test for interactions.

Pair interaction

Let’s now think about possible pair interactions. With four main effects (type + region + quarter + year) we have six possible pair interactions to test:

type:region
type:quarter
type:year
region:quarter
region:year
quarter:year
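
The count of six is just "4 choose 2", and the pairs can be generated rather than typed out. The sketch below builds the interaction terms with combn() and shows one way to add a single term to an existing fit with update(). The helper name fit_with_term is hypothetical, and it assumes the data used to fit the base model is still visible from where the helper is called (for example in the global environment).

```r
# The four main effects used in model4b
main_effects <- c("type", "region", "quarter", "year")

# All unordered pairs, written as interaction terms like "type:region"
pair_terms <- vapply(
  combn(main_effects, 2, simplify = FALSE),
  paste,
  character(1),
  collapse = ":"
)
pair_terms

# Hypothetical helper: refit a model with one extra term added to its formula
fit_with_term <- function(base_model, term) {
  update(base_model, as.formula(paste(". ~ . +", term)))
}

# Usage in this walkthrough might look like:
# fits <- lapply(pair_terms, function(term) fit_with_term(model4b, term))
# names(fits) <- pair_terms
```

Fitting each interaction separately, as done by hand here, keeps the comparisons on equal footing because every candidate is measured against the same four-predictor baseline.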

model5pa <- lm(average_price ~ type + region + quarter + year + type:region, data = trimmed_avocados)
summary(model5pa)

Call:
lm(formula = average_price ~ type + region + quarter + year + 
    type:region, data = trimmed_avocados)

Residuals:
    Min      1Q  Median      3Q     Max 
-1.0082 -0.1335 -0.0024  0.1335  1.4799 

Coefficients:
                                       Estimate Std. Error t value Pr(>|t|)    
(Intercept)                            1.202843   0.018542  64.870  < 2e-16 ***
typeorganic                            0.424556   0.025580  16.597  < 2e-16 ***
regionAtlanta                         -0.279941   0.025580 -10.944  < 2e-16 ***
regionBaltimoreWashington             -0.004556   0.025580  -0.178 0.858635    
regionBoise                           -0.272722   0.025580 -10.661  < 2e-16 ***
regionBoston                          -0.044379   0.025580  -1.735 0.082778 .  
regionBuffaloRochester                 0.033550   0.025580   1.312 0.189681    
regionCalifornia                      -0.243314   0.025580  -9.512  < 2e-16 ***
regionCharlotte                       -0.073669   0.025580  -2.880 0.003983 ** 
regionChicago                          0.020592   0.025580   0.805 0.420838    
regionCincinnatiDayton                -0.333254   0.025580 -13.028  < 2e-16 ***
regionColumbus                        -0.282485   0.025580 -11.043  < 2e-16 ***
regionDallasFtWorth                   -0.502308   0.025580 -19.637  < 2e-16 ***
regionDenver                          -0.274793   0.025580 -10.742  < 2e-16 ***
regionDetroit                         -0.224793   0.025580  -8.788  < 2e-16 ***
regionGrandRapids                     -0.023728   0.025580  -0.928 0.353635    
regionGreatLakes                      -0.166864   0.025580  -6.523 7.07e-11 ***
regionHarrisburgScranton              -0.089941   0.025580  -3.516 0.000439 ***
regionHartfordSpringfield              0.059290   0.025580   2.318 0.020471 *  
regionHouston                         -0.523669   0.025580 -20.472  < 2e-16 ***
regionIndianapolis                    -0.203905   0.025580  -7.971 1.66e-15 ***
regionJacksonville                    -0.155148   0.025580  -6.065 1.34e-09 ***
regionLasVegas                        -0.335799   0.025580 -13.127  < 2e-16 ***
regionLosAngeles                      -0.372308   0.025580 -14.555  < 2e-16 ***
regionLouisville                      -0.243432   0.025580  -9.516  < 2e-16 ***
regionMiamiFtLauderdale               -0.094438   0.025580  -3.692 0.000223 ***
regionMidsouth                        -0.141598   0.025580  -5.535 3.15e-08 ***
regionNashville                       -0.335858   0.025580 -13.130  < 2e-16 ***
regionNewOrleansMobile                -0.263491   0.025580 -10.301  < 2e-16 ***
regionNewYork                          0.053373   0.025580   2.086 0.036948 *  
regionNortheast                       -0.004320   0.025580  -0.169 0.865907    
regionNorthernNewEngland              -0.088521   0.025580  -3.461 0.000540 ***
regionOrlando                         -0.134320   0.025580  -5.251 1.53e-07 ***
regionPhiladelphia                     0.047574   0.025580   1.860 0.062930 .  
regionPhoenixTucson                   -0.620533   0.025580 -24.258  < 2e-16 ***
regionPittsburgh                      -0.098107   0.025580  -3.835 0.000126 ***
regionPlains                          -0.183254   0.025580  -7.164 8.14e-13 ***
regionPortland                        -0.302249   0.025580 -11.816  < 2e-16 ***
regionRaleighGreensboro               -0.121657   0.025580  -4.756 1.99e-06 ***
regionRichmondNorfolk                 -0.228935   0.025580  -8.950  < 2e-16 ***
regionRoanoke                         -0.252722   0.025580  -9.880  < 2e-16 ***
regionSacramento                      -0.074793   0.025580  -2.924 0.003461 ** 
regionSanDiego                        -0.287278   0.025580 -11.230  < 2e-16 ***
regionSanFrancisco                     0.048402   0.025580   1.892 0.058483 .  
regionSeattle                         -0.178994   0.025580  -6.997 2.70e-12 ***
regionSouthCarolina                   -0.202544   0.025580  -7.918 2.55e-15 ***
regionSouthCentral                    -0.479349   0.025580 -18.739  < 2e-16 ***
regionSoutheast                       -0.185740   0.025580  -7.261 4.00e-13 ***
regionSpokane                         -0.232781   0.025580  -9.100  < 2e-16 ***
regionStLouis                         -0.163018   0.025580  -6.373 1.90e-10 ***
regionSyracuse                         0.038166   0.025580   1.492 0.135716    
regionTampa                           -0.147160   0.025580  -5.753 8.91e-09 ***
regionTotalUS                         -0.256746   0.025580 -10.037  < 2e-16 ***
regionWest                            -0.363669   0.025580 -14.217  < 2e-16 ***
regionWestTexNewMexico                -0.506627   0.025580 -19.805  < 2e-16 ***
quarter2                               0.081206   0.005125  15.846  < 2e-16 ***
quarter3                               0.218901   0.005124  42.721  < 2e-16 ***
quarter4                               0.162013   0.005092  31.814  < 2e-16 ***
year2016                              -0.037010   0.004438  -8.340  < 2e-16 ***
year2017                               0.138688   0.004417  31.396  < 2e-16 ***
year2018                               0.087411   0.007895  11.071  < 2e-16 ***
typeorganic:regionAtlanta              0.113728   0.036176   3.144 0.001671 ** 
typeorganic:regionBaltimoreWashington -0.044497   0.036176  -1.230 0.218705    
typeorganic:regionBoise                0.119645   0.036176   3.307 0.000944 ***
typeorganic:regionBoston               0.028462   0.036176   0.787 0.431435    
typeorganic:regionBuffaloRochester    -0.155503   0.036176  -4.299 1.73e-05 ***
typeorganic:regionCalifornia           0.155207   0.036176   4.290 1.79e-05 ***
typeorganic:regionCharlotte            0.237337   0.036176   6.561 5.50e-11 ***
typeorganic:regionChicago             -0.049704   0.036176  -1.374 0.169471    
typeorganic:regionCincinnatiDayton    -0.037160   0.036176  -1.027 0.304341    
typeorganic:regionColumbus            -0.051538   0.036176  -1.425 0.154271    
typeorganic:regionDallasFtWorth        0.053728   0.036176   1.485 0.137512    
typeorganic:regionDenver              -0.135325   0.036176  -3.741 0.000184 ***
typeorganic:regionDetroit             -0.120296   0.036176  -3.325 0.000885 ***
typeorganic:regionGrandRapids         -0.064615   0.036176  -1.786 0.074092 .  
typeorganic:regionGreatLakes          -0.111243   0.036176  -3.075 0.002108 ** 
typeorganic:regionHarrisburgScranton   0.084379   0.036176   2.332 0.019687 *  
typeorganic:regionHartfordSpringfield  0.396627   0.036176  10.964  < 2e-16 ***
typeorganic:regionHouston              0.021124   0.036176   0.584 0.559273    
typeorganic:regionIndianapolis        -0.086272   0.036176  -2.385 0.017099 *  
typeorganic:regionJacksonville         0.210118   0.036176   5.808 6.42e-09 ***
typeorganic:regionLasVegas             0.311361   0.036176   8.607  < 2e-16 ***
typeorganic:regionLosAngeles           0.054556   0.036176   1.508 0.131550    
typeorganic:regionLouisville          -0.061834   0.036176  -1.709 0.087418 .  
typeorganic:regionMiamiFtLauderdale   -0.076213   0.036176  -2.107 0.035154 *  
typeorganic:regionMidsouth            -0.029349   0.036176  -0.811 0.417210    
typeorganic:regionNashville           -0.026154   0.036176  -0.723 0.469711    
typeorganic:regionNewOrleansMobile     0.014497   0.036176   0.401 0.688618    
typeorganic:regionNewYork              0.226331   0.036176   6.256 4.03e-10 ***
typeorganic:regionNortheast            0.090414   0.036176   2.499 0.012453 *  
typeorganic:regionNorthernNewEngland   0.009763   0.036176   0.270 0.787252    
typeorganic:regionOrlando              0.158994   0.036176   4.395 1.11e-05 ***
typeorganic:regionPhiladelphia         0.047041   0.036176   1.300 0.193496    
typeorganic:regionPhoenixTucson        0.567870   0.036176  15.697  < 2e-16 ***
typeorganic:regionPittsburgh          -0.197219   0.036176  -5.452 5.05e-08 ***
typeorganic:regionPlains               0.117456   0.036176   3.247 0.001169 ** 
typeorganic:regionPortland             0.117870   0.036176   3.258 0.001123 ** 
typeorganic:regionRaleighGreensboro    0.231479   0.036176   6.399 1.61e-10 ***
typeorganic:regionRichmondNorfolk     -0.081538   0.036176  -2.254 0.024211 *  
typeorganic:regionRoanoke             -0.120769   0.036176  -3.338 0.000844 ***
typeorganic:regionSacramento           0.270651   0.036176   7.482 7.68e-14 ***
typeorganic:regionSanDiego             0.248817   0.036176   6.878 6.27e-12 ***
typeorganic:regionSanFrancisco         0.389527   0.036176  10.768  < 2e-16 ***
typeorganic:regionSeattle              0.121065   0.036176   3.347 0.000820 ***
typeorganic:regionSouthCarolina        0.089586   0.036176   2.476 0.013281 *  
typeorganic:regionSouthCentral         0.039112   0.036176   1.081 0.279633    
typeorganic:regionSoutheast            0.045444   0.036176   1.256 0.209063    
typeorganic:regionSpokane              0.234675   0.036176   6.487 8.98e-11 ***
typeorganic:regionStLouis              0.065207   0.036176   1.803 0.071483 .  
typeorganic:regionSyracuse            -0.157751   0.036176  -4.361 1.30e-05 ***
typeorganic:regionTampa               -0.010059   0.036176  -0.278 0.780967    
typeorganic:regionTotalUS              0.029467   0.036176   0.815 0.415334    
typeorganic:regionWest                 0.149704   0.036176   4.138 3.52e-05 ***
typeorganic:regionWestTexNewMexico     0.423157   0.036257  11.671  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2351 on 18135 degrees of freedom
Multiple R-squared:  0.6611,    Adjusted R-squared:  0.659 
F-statistic: 313.1 on 113 and 18135 DF,  p-value: < 2.2e-16

model5pb <- lm(average_price ~ type + region + quarter + year + type:quarter, data = trimmed_avocados)
summary(model5pb)

Call:
lm(formula = average_price ~ type + region + quarter + year + 
    type:quarter, data = trimmed_avocados)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.02358 -0.14643 -0.00311  0.14370  1.44227 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)                1.180432   0.014545  81.158  < 2e-16 ***
typeorganic                0.469434   0.006682  70.256  < 2e-16 ***
regionAtlanta             -0.223077   0.019073 -11.696  < 2e-16 ***
regionBaltimoreWashington -0.026805   0.019073  -1.405 0.159924    
regionBoise               -0.212899   0.019073 -11.162  < 2e-16 ***
regionBoston              -0.030148   0.019073  -1.581 0.113971    
regionBuffaloRochester    -0.044201   0.019073  -2.317 0.020488 *  
regionCalifornia          -0.165710   0.019073  -8.688  < 2e-16 ***
regionCharlotte            0.045000   0.019073   2.359 0.018316 *  
regionChicago             -0.004260   0.019073  -0.223 0.823248    
regionCincinnatiDayton    -0.351834   0.019073 -18.447  < 2e-16 ***
regionColumbus            -0.308254   0.019073 -16.162  < 2e-16 ***
regionDallasFtWorth       -0.475444   0.019073 -24.928  < 2e-16 ***
regionDenver              -0.342456   0.019073 -17.955  < 2e-16 ***
regionDetroit             -0.284941   0.019073 -14.940  < 2e-16 ***
regionGrandRapids         -0.056036   0.019073  -2.938 0.003308 ** 
regionGreatLakes          -0.222485   0.019073 -11.665  < 2e-16 ***
regionHarrisburgScranton  -0.047751   0.019073  -2.504 0.012301 *  
regionHartfordSpringfield  0.257604   0.019073  13.506  < 2e-16 ***
regionHouston             -0.513107   0.019073 -26.902  < 2e-16 ***
regionIndianapolis        -0.247041   0.019073 -12.953  < 2e-16 ***
regionJacksonville        -0.050089   0.019073  -2.626 0.008642 ** 
regionLasVegas            -0.180118   0.019073  -9.444  < 2e-16 ***
regionLosAngeles          -0.345030   0.019073 -18.090  < 2e-16 ***
regionLouisville          -0.274349   0.019073 -14.384  < 2e-16 ***
regionMiamiFtLauderdale   -0.132544   0.019073  -6.949 3.79e-12 ***
regionMidsouth            -0.156272   0.019073  -8.193 2.71e-16 ***
regionNashville           -0.348935   0.019073 -18.295  < 2e-16 ***
regionNewOrleansMobile    -0.256243   0.019073 -13.435  < 2e-16 ***
regionNewYork              0.166538   0.019073   8.732  < 2e-16 ***
regionNortheast            0.040888   0.019073   2.144 0.032066 *  
regionNorthernNewEngland  -0.083639   0.019073  -4.385 1.17e-05 ***
regionOrlando             -0.054822   0.019073  -2.874 0.004053 ** 
regionPhiladelphia         0.071095   0.019073   3.728 0.000194 ***
regionPhoenixTucson       -0.336598   0.019073 -17.648  < 2e-16 ***
regionPittsburgh          -0.196716   0.019073 -10.314  < 2e-16 ***
regionPlains              -0.124527   0.019073  -6.529 6.80e-11 ***
regionPortland            -0.243314   0.019073 -12.757  < 2e-16 ***
regionRaleighGreensboro   -0.005917   0.019073  -0.310 0.756382    
regionRichmondNorfolk     -0.269704   0.019073 -14.141  < 2e-16 ***
regionRoanoke             -0.313107   0.019073 -16.416  < 2e-16 ***
regionSacramento           0.060533   0.019073   3.174 0.001507 ** 
regionSanDiego            -0.162870   0.019073  -8.539  < 2e-16 ***
regionSanFrancisco         0.243166   0.019073  12.749  < 2e-16 ***
regionSeattle             -0.118462   0.019073  -6.211 5.38e-10 ***
regionSouthCarolina       -0.157751   0.019073  -8.271  < 2e-16 ***
regionSouthCentral        -0.459793   0.019073 -24.107  < 2e-16 ***
regionSoutheast           -0.163018   0.019073  -8.547  < 2e-16 ***
regionSpokane             -0.115444   0.019073  -6.053 1.45e-09 ***
regionStLouis             -0.130414   0.019073  -6.838 8.30e-12 ***
regionSyracuse            -0.040710   0.019073  -2.134 0.032819 *  
regionTampa               -0.152189   0.019073  -7.979 1.56e-15 ***
regionTotalUS             -0.242012   0.019073 -12.689  < 2e-16 ***
regionWest                -0.288817   0.019073 -15.143  < 2e-16 ***
regionWestTexNewMexico    -0.296626   0.019116 -15.518  < 2e-16 ***
quarter2                   0.066217   0.007413   8.933  < 2e-16 ***
quarter3                   0.186137   0.007413  25.110  < 2e-16 ***
quarter4                   0.152474   0.007364  20.706  < 2e-16 ***
year2016                  -0.036977   0.004679  -7.902 2.89e-15 ***
year2017                   0.138659   0.004658  29.768  < 2e-16 ***
year2018                   0.087412   0.008325  10.500  < 2e-16 ***
typeorganic:quarter2       0.029809   0.010152   2.936 0.003325 ** 
typeorganic:quarter3       0.065528   0.010150   6.456 1.10e-10 ***
typeorganic:quarter4       0.018995   0.010079   1.885 0.059501 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2479 on 18185 degrees of freedom
Multiple R-squared:  0.6222,    Adjusted R-squared:  0.6209 
F-statistic: 475.3 on 63 and 18185 DF,  p-value: < 2.2e-16

model5pc <- lm(average_price ~ type + region + quarter + year + type:year, data = trimmed_avocados)
summary(model5pc)

Call:
lm(formula = average_price ~ type + region + quarter + year + 
    type:year, data = trimmed_avocados)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.00911 -0.14461 -0.00436  0.13900  1.46703 

Coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)                1.117496   0.014421  77.493  < 2e-16 ***
typeorganic                0.595327   0.006565  90.688  < 2e-16 ***
regionAtlanta             -0.223077   0.018919 -11.791  < 2e-16 ***
regionBaltimoreWashington -0.026805   0.018919  -1.417 0.156565    
regionBoise               -0.212899   0.018919 -11.253  < 2e-16 ***
regionBoston              -0.030148   0.018919  -1.593 0.111069    
regionBuffaloRochester    -0.044201   0.018919  -2.336 0.019488 *  
regionCalifornia          -0.165710   0.018919  -8.759  < 2e-16 ***
regionCharlotte            0.045000   0.018919   2.379 0.017393 *  
regionChicago             -0.004260   0.018919  -0.225 0.821839    
regionCincinnatiDayton    -0.351834   0.018919 -18.596  < 2e-16 ***
regionColumbus            -0.308254   0.018919 -16.293  < 2e-16 ***
regionDallasFtWorth       -0.475444   0.018919 -25.130  < 2e-16 ***
regionDenver              -0.342456   0.018919 -18.101  < 2e-16 ***
regionDetroit             -0.284941   0.018919 -15.061  < 2e-16 ***
regionGrandRapids         -0.056036   0.018919  -2.962 0.003063 ** 
regionGreatLakes          -0.222485   0.018919 -11.760  < 2e-16 ***
regionHarrisburgScranton  -0.047751   0.018919  -2.524 0.011613 *  
regionHartfordSpringfield  0.257604   0.018919  13.616  < 2e-16 ***
regionHouston             -0.513107   0.018919 -27.121  < 2e-16 ***
regionIndianapolis        -0.247041   0.018919 -13.058  < 2e-16 ***
regionJacksonville        -0.050089   0.018919  -2.647 0.008117 ** 
regionLasVegas            -0.180118   0.018919  -9.520  < 2e-16 ***
regionLosAngeles          -0.345030   0.018919 -18.237  < 2e-16 ***
regionLouisville          -0.274349   0.018919 -14.501  < 2e-16 ***
regionMiamiFtLauderdale   -0.132544   0.018919  -7.006 2.54e-12 ***
regionMidsouth            -0.156272   0.018919  -8.260  < 2e-16 ***
regionNashville           -0.348935   0.018919 -18.443  < 2e-16 ***
regionNewOrleansMobile    -0.256243   0.018919 -13.544  < 2e-16 ***
regionNewYork              0.166538   0.018919   8.802  < 2e-16 ***
regionNortheast            0.040888   0.018919   2.161 0.030698 *  
regionNorthernNewEngland  -0.083639   0.018919  -4.421 9.89e-06 ***
regionOrlando             -0.054822   0.018919  -2.898 0.003764 ** 
regionPhiladelphia         0.071095   0.018919   3.758 0.000172 ***
regionPhoenixTucson       -0.336598   0.018919 -17.791  < 2e-16 ***
regionPittsburgh          -0.196716   0.018919 -10.398  < 2e-16 ***
regionPlains              -0.124527   0.018919  -6.582 4.77e-11 ***
regionPortland            -0.243314   0.018919 -12.860  < 2e-16 ***
regionRaleighGreensboro   -0.005917   0.018919  -0.313 0.754471    
regionRichmondNorfolk     -0.269704   0.018919 -14.255  < 2e-16 ***
regionRoanoke             -0.313107   0.018919 -16.549  < 2e-16 ***
regionSacramento           0.060533   0.018919   3.199 0.001379 ** 
regionSanDiego            -0.162870   0.018919  -8.609  < 2e-16 ***
regionSanFrancisco         0.243166   0.018919  12.853  < 2e-16 ***
regionSeattle             -0.118462   0.018919  -6.261 3.90e-10 ***
regionSouthCarolina       -0.157751   0.018919  -8.338  < 2e-16 ***
regionSouthCentral        -0.459793   0.018919 -24.303  < 2e-16 ***
regionSoutheast           -0.163018   0.018919  -8.616  < 2e-16 ***
regionSpokane             -0.115444   0.018919  -6.102 1.07e-09 ***
regionStLouis             -0.130414   0.018919  -6.893 5.64e-12 ***
regionSyracuse            -0.040710   0.018919  -2.152 0.031430 *  
regionTampa               -0.152189   0.018919  -8.044 9.22e-16 ***
regionTotalUS             -0.242012   0.018919 -12.792  < 2e-16 ***
regionWest                -0.288817   0.018919 -15.266  < 2e-16 ***
regionWestTexNewMexico    -0.296641   0.018962 -15.644  < 2e-16 ***
quarter2                   0.081108   0.005360  15.132  < 2e-16 ***
quarter3                   0.218901   0.005359  40.844  < 2e-16 ***
quarter4                   0.161984   0.005327  30.410  < 2e-16 ***
year2016                   0.027632   0.006564   4.210 2.57e-05 ***
year2017                   0.216048   0.006533  33.069  < 2e-16 ***
year2018                   0.165421   0.011209  14.758  < 2e-16 ***
typeorganic:year2016      -0.129237   0.009283 -13.921  < 2e-16 ***
typeorganic:year2017      -0.154818   0.009240 -16.755  < 2e-16 ***
typeorganic:year2018      -0.156037   0.015159 -10.293  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.246 on 18185 degrees of freedom
Multiple R-squared:  0.6282,    Adjusted R-squared:  0.6269 
F-statistic: 487.7 on 63 and 18185 DF,  p-value: < 2.2e-16
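With this many region rows in every summary, the three lines at the bottom (residual standard error, R-squared, adjusted R-squared) are the ones that actually matter for comparing candidate models. A minimal sketch of pulling them out of a fitted object with base R follows. `fit_stats()` is a hypothetical helper name, and the two `mtcars` models are stand-ins so the snippet runs on its own; they are not the avocado models.

```r
# Hypothetical helper: collect the headline fit statistics from an lm object.
fit_stats <- function(model) {
  s <- summary(model)
  data.frame(
    n_coefficients = length(coef(model)),
    residual_se    = s$sigma,
    r_squared      = s$r.squared,
    adj_r_squared  = s$adj.r.squared
  )
}

# Stand-in models on a built-in dataset (assumption: illustration only).
main_effects     <- lm(mpg ~ wt + factor(cyl), data = mtcars)
with_interaction <- lm(mpg ~ wt + factor(cyl) + wt:factor(cyl), data = mtcars)

# One row per model, so the fits can be compared side by side.
comparison <- rbind(
  cbind(model = "main_effects", fit_stats(main_effects)),
  cbind(model = "with_interaction", fit_stats(with_interaction))
)
comparison
```

Any of the fitted `lm` objects in this walkthrough could be passed to the helper in the same way, which avoids scrolling through the full coefficient table each time.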
# add a region:quarter interaction on top of the main effects
model5pd <- lm(
  average_price ~ type + region + quarter + year + region:quarter,
  data = trimmed_avocados
)
summary(model5pd)

Call:
lm(formula = average_price ~ type + region + quarter + year + 
    region:quarter, data = trimmed_avocados)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.06598 -0.14588  0.00059  0.14115  1.38051 

Coefficients:
                                    Estimate Std. Error t value Pr(>|t|)    
(Intercept)                         1.216463   0.024241  50.182  < 2e-16 ***
typeorganic                         0.495917   0.003583 138.408  < 2e-16 ***
regionAtlanta                      -0.257647   0.033888  -7.603 3.04e-14 ***
regionBaltimoreWashington          -0.089804   0.033888  -2.650 0.008056 ** 
regionBoise                        -0.285392   0.033888  -8.422  < 2e-16 ***
regionBoston                       -0.007059   0.033888  -0.208 0.835000    
regionBuffaloRochester             -0.031078   0.033888  -0.917 0.359111    
regionCalifornia                   -0.279706   0.033888  -8.254  < 2e-16 ***
regionCharlotte                    -0.021471   0.033888  -0.634 0.526370    
regionChicago                      -0.073627   0.033888  -2.173 0.029820 *  
regionCincinnatiDayton             -0.434902   0.033888 -12.833  < 2e-16 ***
regionColumbus                     -0.324804   0.033888  -9.585  < 2e-16 ***
regionDallasFtWorth                -0.484510   0.033888 -14.297  < 2e-16 ***
regionDenver                       -0.421569   0.033888 -12.440  < 2e-16 ***
regionDetroit                      -0.305000   0.033888  -9.000  < 2e-16 ***
regionGrandRapids                  -0.128235   0.033888  -3.784 0.000155 ***
regionGreatLakes                   -0.268137   0.033888  -7.912 2.67e-15 ***
regionHarrisburgScranton           -0.060000   0.033888  -1.771 0.076657 .  
regionHartfordSpringfield           0.229020   0.033888   6.758 1.44e-11 ***
regionHouston                      -0.537059   0.033888 -15.848  < 2e-16 ***
regionIndianapolis                 -0.273824   0.033888  -8.080 6.87e-16 ***
regionJacksonville                 -0.110392   0.033888  -3.258 0.001126 ** 
regionLasVegas                     -0.290686   0.033888  -8.578  < 2e-16 ***
regionLosAngeles                   -0.433039   0.033888 -12.778  < 2e-16 ***
regionLouisville                   -0.295490   0.033888  -8.720  < 2e-16 ***
regionMiamiFtLauderdale            -0.111863   0.033888  -3.301 0.000966 ***
regionMidsouth                     -0.194510   0.033888  -5.740 9.64e-09 ***
regionNashville                    -0.351275   0.033888 -10.366  < 2e-16 ***
regionNewOrleansMobile             -0.317255   0.033888  -9.362  < 2e-16 ***
regionNewYork                       0.105098   0.033888   3.101 0.001930 ** 
regionNortheast                     0.020000   0.033888   0.590 0.555082    
regionNorthernNewEngland           -0.059804   0.033888  -1.765 0.077625 .  
regionOrlando                      -0.103431   0.033888  -3.052 0.002276 ** 
regionPhiladelphia                  0.016569   0.033888   0.489 0.624905    
regionPhoenixTucson                -0.445294   0.033888 -13.140  < 2e-16 ***
regionPittsburgh                   -0.174510   0.033888  -5.150 2.64e-07 ***
regionPlains                       -0.184412   0.033888  -5.442 5.34e-08 ***
regionPortland                     -0.353235   0.033888 -10.424  < 2e-16 ***
regionRaleighGreensboro            -0.058039   0.033888  -1.713 0.086792 .  
regionRichmondNorfolk              -0.263627   0.033888  -7.779 7.68e-15 ***
regionRoanoke                      -0.312255   0.033888  -9.214  < 2e-16 ***
regionSacramento                   -0.027059   0.033888  -0.798 0.424608    
regionSanDiego                     -0.286667   0.033888  -8.459  < 2e-16 ***
regionSanFrancisco                  0.090588   0.033888   2.673 0.007521 ** 
regionSeattle                      -0.258824   0.033888  -7.638 2.32e-14 ***
regionSouthCarolina                -0.206961   0.033888  -6.107 1.04e-09 ***
regionSouthCentral                 -0.475686   0.033888 -14.037  < 2e-16 ***
regionSoutheast                    -0.207255   0.033888  -6.116 9.80e-10 ***
regionSpokane                      -0.269608   0.033888  -7.956 1.88e-15 ***
regionStLouis                      -0.190980   0.033888  -5.636 1.77e-08 ***
regionSyracuse                     -0.027647   0.033888  -0.816 0.414609    
regionTampa                        -0.153235   0.033888  -4.522 6.17e-06 ***
regionTotalUS                      -0.290392   0.033888  -8.569  < 2e-16 ***
regionWest                         -0.389020   0.033888 -11.479  < 2e-16 ***
regionWestTexNewMexico             -0.365980   0.033888 -10.800  < 2e-16 ***
quarter2                            0.085685   0.036447   2.351 0.018736 *  
quarter3                            0.093249   0.036447   2.558 0.010521 *  
quarter4                            0.071967   0.036188   1.989 0.046752 *  
year2016                           -0.036996   0.004567  -8.100 5.83e-16 ***
year2017                            0.138600   0.004546  30.485  < 2e-16 ***
year2018                            0.087387   0.008126  10.754  < 2e-16 ***
regionAtlanta:quarter2             -0.088379   0.051480  -1.717 0.086041 .  
regionBaltimoreWashington:quarter2  0.092368   0.051480   1.794 0.072790 .  
regionBoise:quarter2               -0.095505   0.051480  -1.855 0.063585 .  
regionBoston:quarter2               0.011418   0.051480   0.222 0.824479    
regionBuffaloRochester:quarter2     0.081719   0.051480   1.587 0.112440    
regionCalifornia:quarter2           0.003552   0.051480   0.069 0.944992    
regionCharlotte:quarter2            0.062240   0.051480   1.209 0.226676    
regionChicago:quarter2             -0.004193   0.051480  -0.081 0.935085    
regionCincinnatiDayton:quarter2     0.010030   0.051480   0.195 0.845524    
regionColumbus:quarter2            -0.094042   0.051480  -1.827 0.067751 .  
regionDallasFtWorth:quarter2       -0.078439   0.051480  -1.524 0.127607    
regionDenver:quarter2              -0.015739   0.051480  -0.306 0.759813    
regionDetroit:quarter2             -0.036923   0.051480  -0.717 0.473241    
regionGrandRapids:quarter2          0.135799   0.051480   2.638 0.008349 ** 
regionGreatLakes:quarter2          -0.011478   0.051480  -0.223 0.823567    
regionHarrisburgScranton:quarter2   0.065513   0.051480   1.273 0.203181    
regionHartfordSpringfield:quarter2  0.067262   0.051480   1.307 0.191375    
regionHouston:quarter2             -0.089223   0.051480  -1.733 0.083084 .  
regionIndianapolis:quarter2        -0.064253   0.051480  -1.248 0.212003    
regionJacksonville:quarter2         0.028213   0.051480   0.548 0.583677    
regionLasVegas:quarter2            -0.074314   0.051480  -1.444 0.148885    
regionLosAngeles:quarter2          -0.060679   0.051480  -1.179 0.238540    
regionLouisville:quarter2          -0.074510   0.051480  -1.447 0.147816    
regionMiamiFtLauderdale:quarter2   -0.009676   0.051480  -0.188 0.850917    
regionMidsouth:quarter2            -0.013952   0.051480  -0.271 0.786385    
regionNashville:quarter2           -0.102572   0.051480  -1.992 0.046336 *  
regionNewOrleansMobile:quarter2     0.083793   0.051480   1.628 0.103609    
regionNewYork:quarter2              0.087722   0.051480   1.704 0.088397 .  
regionNortheast:quarter2            0.056410   0.051480   1.096 0.273195    
regionNorthernNewEngland:quarter2  -0.067632   0.051480  -1.314 0.188947    
regionOrlando:quarter2              0.018047   0.051480   0.351 0.725924    
regionPhiladelphia:quarter2         0.109970   0.051480   2.136 0.032680 *  
regionPhoenixTucson:quarter2       -0.020090   0.051480  -0.390 0.696351    
regionPittsburgh:quarter2          -0.038054   0.051480  -0.739 0.459792    
regionPlains:quarter2              -0.002896   0.051480  -0.056 0.955141    
regionPortland:quarter2            -0.045354   0.051480  -0.881 0.378324    
regionRaleighGreensboro:quarter2    0.001885   0.051480   0.037 0.970786    
regionRichmondNorfolk:quarter2     -0.113552   0.051480  -2.206 0.027414 *  
regionRoanoke:quarter2             -0.131207   0.051480  -2.549 0.010821 *  
regionSacramento:quarter2           0.084238   0.051480   1.636 0.101788    
regionSanDiego:quarter2            -0.003333   0.051480  -0.065 0.948374    
regionSanFrancisco:quarter2         0.121976   0.051480   2.369 0.017828 *  
regionSeattle:quarter2              0.012029   0.051480   0.234 0.815254    
regionSouthCarolina:quarter2        0.027602   0.051480   0.536 0.591851    
regionSouthCentral:quarter2        -0.072262   0.051480  -1.404 0.160426    
regionSoutheast:quarter2           -0.005950   0.051480  -0.116 0.907984    
regionSpokane:quarter2              0.009736   0.051480   0.189 0.849999    
regionStLouis:quarter2              0.057006   0.051480   1.107 0.268161    
regionSyracuse:quarter2             0.064955   0.051480   1.262 0.207057    
regionTampa:quarter2                0.006056   0.051480   0.118 0.906359    
regionTotalUS:quarter2             -0.009223   0.051480  -0.179 0.857813    
regionWest:quarter2                -0.029186   0.051480  -0.567 0.570770    
regionWestTexNewMexico:quarter2    -0.096213   0.051672  -1.862 0.062620 .  
regionAtlanta:quarter3              0.122391   0.051480   2.377 0.017444 *  
regionBaltimoreWashington:quarter3  0.095830   0.051480   1.861 0.062691 .  
regionBoise:quarter3                0.251931   0.051480   4.894 9.98e-07 ***
regionBoston:quarter3              -0.001146   0.051480  -0.022 0.982235    
regionBuffaloRochester:quarter3    -0.034050   0.051480  -0.661 0.508354    
regionCalifornia:quarter3           0.255860   0.051480   4.970 6.75e-07 ***
regionCharlotte:quarter3            0.139804   0.051480   2.716 0.006620 ** 
regionChicago:quarter3              0.174012   0.051480   3.380 0.000726 ***
regionCincinnatiDayton:quarter3     0.212594   0.051480   4.130 3.65e-05 ***
regionColumbus:quarter3             0.109291   0.051480   2.123 0.033769 *  
regionDallasFtWorth:quarter3        0.023228   0.051480   0.451 0.651852    
regionDenver:quarter3               0.212466   0.051480   4.127 3.69e-05 ***
regionDetroit:quarter3              0.054872   0.051480   1.066 0.286490    
regionGrandRapids:quarter3          0.091440   0.051480   1.776 0.075712 .  
regionGreatLakes:quarter3           0.123522   0.051480   2.399 0.016432 *  
regionHarrisburgScranton:quarter3   0.006795   0.051480   0.132 0.894993    
regionHartfordSpringfield:quarter3  0.049442   0.051480   0.960 0.336862    
regionHouston:quarter3              0.072059   0.051480   1.400 0.161608    
regionIndianapolis:quarter3         0.092157   0.051480   1.790 0.073447 .  
regionJacksonville:quarter3         0.168213   0.051480   3.268 0.001087 ** 
regionLasVegas:quarter3             0.295302   0.051480   5.736 9.84e-09 ***
regionLosAngeles:quarter3           0.214578   0.051480   4.168 3.08e-05 ***
regionLouisville:quarter3           0.084721   0.051480   1.646 0.099842 .  
regionMiamiFtLauderdale:quarter3   -0.072240   0.051480  -1.403 0.160557    
regionMidsouth:quarter3             0.095407   0.051480   1.853 0.063858 .  
regionNashville:quarter3            0.041531   0.051480   0.807 0.419828    
regionNewOrleansMobile:quarter3     0.071357   0.051480   1.386 0.165728    
regionNewYork:quarter3              0.112338   0.051480   2.182 0.029110 *  
regionNortheast:quarter3            0.050256   0.051480   0.976 0.328963    
regionNorthernNewEngland:quarter3  -0.013658   0.051480  -0.265 0.790782    
regionOrlando:quarter3              0.116252   0.051480   2.258 0.023946 *  
regionPhiladelphia:quarter3         0.082149   0.051480   1.596 0.110562    
regionPhoenixTucson:quarter3        0.260038   0.051480   5.051 4.43e-07 ***
regionPittsburgh:quarter3          -0.016131   0.051480  -0.313 0.754019    
regionPlains:quarter3               0.136335   0.051480   2.648 0.008097 ** 
regionPortland:quarter3             0.334261   0.051480   6.493 8.63e-11 ***
regionRaleighGreensboro:quarter3    0.121373   0.051480   2.358 0.018401 *  
regionRichmondNorfolk:quarter3      0.051576   0.051480   1.002 0.316421    
regionRoanoke:quarter3              0.090460   0.051480   1.757 0.078903 .  
regionSacramento:quarter3           0.181161   0.051480   3.519 0.000434 ***
regionSanDiego:quarter3             0.280385   0.051480   5.446 5.21e-08 ***
regionSanFrancisco:quarter3         0.312360   0.051480   6.068 1.32e-09 ***
regionSeattle:quarter3              0.392029   0.051480   7.615 2.76e-14 ***
regionSouthCarolina:quarter3        0.102345   0.051480   1.988 0.046820 *  
regionSouthCentral:quarter3         0.042609   0.051480   0.828 0.407859    
regionSoutheast:quarter3            0.111357   0.051480   2.163 0.030545 *  
regionSpokane:quarter3              0.393582   0.051480   7.645 2.19e-14 ***
regionStLouis:quarter3              0.192134   0.051480   3.732 0.000190 ***
regionSyracuse:quarter3            -0.036840   0.051480  -0.716 0.474236    
regionTampa:quarter3               -0.043047   0.051480  -0.836 0.403063    
regionTotalUS:quarter3              0.104751   0.051480   2.035 0.041887 *  
regionWest:quarter3                 0.297609   0.051480   5.781 7.55e-09 ***
regionWestTexNewMexico:quarter3     0.178160   0.051480   3.461 0.000540 ***
regionAtlanta:quarter4              0.112897   0.051114   2.209 0.027206 *  
regionBaltimoreWashington:quarter4  0.082679   0.051114   1.618 0.105780    
regionBoise:quarter4                0.153767   0.051114   3.008 0.002631 ** 
regionBoston:quarter4              -0.107566   0.051114  -2.104 0.035355 *  
regionBuffaloRochester:quarter4    -0.101922   0.051114  -1.994 0.046167 *  
regionCalifornia:quarter4           0.228706   0.051114   4.474 7.71e-06 ***
regionCharlotte:quarter4            0.083846   0.051114   1.640 0.100948    
regionChicago:quarter4              0.127502   0.051114   2.494 0.012624 *  
regionCincinnatiDayton:quarter4     0.133902   0.051114   2.620 0.008809 ** 
regionColumbus:quarter4             0.055054   0.051114   1.077 0.281460    
regionDallasFtWorth:quarter4        0.092135   0.051114   1.803 0.071479 .  
regionDenver:quarter4               0.142444   0.051114   2.787 0.005329 ** 
regionDetroit:quarter4              0.067250   0.051114   1.316 0.188297    
regionGrandRapids:quarter4          0.083485   0.051114   1.633 0.102421    
regionGreatLakes:quarter4           0.083637   0.051114   1.636 0.101797    
regionHarrisburgScranton:quarter4  -0.018750   0.051114  -0.367 0.713753    
regionHartfordSpringfield:quarter4  0.006980   0.051114   0.137 0.891376    
regionHouston:quarter4              0.117934   0.051114   2.307 0.021051 *  
regionIndianapolis:quarter4         0.085949   0.051114   1.682 0.092683 .  
regionJacksonville:quarter4         0.063267   0.051114   1.238 0.215820    
regionLasVegas:quarter4             0.251686   0.051114   4.924 8.55e-07 ***
regionLosAngeles:quarter4           0.221789   0.051114   4.339 1.44e-05 ***
regionLouisville:quarter4           0.079365   0.051114   1.553 0.120511    
regionMiamiFtLauderdale:quarter4   -0.007512   0.051114  -0.147 0.883157    
regionMidsouth:quarter4             0.082135   0.051114   1.607 0.108096    
regionNashville:quarter4            0.069400   0.051114   1.358 0.174564    
regionNewOrleansMobile:quarter4     0.106505   0.051114   2.084 0.037204 *  
regionNewYork:quarter4              0.064527   0.051114   1.262 0.206818    
regionNortheast:quarter4           -0.015750   0.051114  -0.308 0.757984    
regionNorthernNewEngland:quarter4  -0.021446   0.051114  -0.420 0.674803    
regionOrlando:quarter4              0.074431   0.051114   1.456 0.145360    
regionPhiladelphia:quarter4         0.043056   0.051114   0.842 0.399599    
regionPhoenixTucson:quarter4        0.225294   0.051114   4.408 1.05e-05 ***
 [ reached getOption("max.print") -- omitted 20 rows ]
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.242 on 18029 degrees of freedom
Multiple R-squared:  0.6431,    Adjusted R-squared:  0.6388 
F-statistic: 148.4 on 219 and 18029 DF,  p-value: < 2.2e-16
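Many of the individual region:quarter rows above are not significant on their own, and the printed table is truncated, so the per-row stars are a poor guide to whether the interaction as a whole is worth keeping. One way to check this is a joint F-test between the model without the interaction and the model with it, using `anova()`. The sketch below uses simulated stand-in data (the names `sim`, `reduced` and `full` are assumptions for illustration, not objects from this analysis) so that it runs on its own.

```r
set.seed(1)

# Simulated stand-in data (not the avocado data): price depends on a
# group-by-season interaction.
sim <- expand.grid(
  group  = factor(c("a", "b", "c")),
  season = factor(c("q1", "q2")),
  rep    = 1:30
)
sim$price <- 1 +
  0.2 * (sim$group == "b") +
  0.3 * (sim$season == "q2") +
  0.5 * (sim$group == "c" & sim$season == "q2") +
  rnorm(nrow(sim), sd = 0.2)

# Nested pair: the full model adds only the interaction terms.
reduced <- lm(price ~ group + season, data = sim)
full    <- lm(price ~ group + season + group:season, data = sim)

# Joint F-test: do the interaction terms, taken together, reduce the
# residual sum of squares by more than chance would explain?
interaction_test <- anova(reduced, full)
interaction_test
```

For the model above, the equivalent comparison would be a main-effects fit, `lm(average_price ~ type + region + quarter + year, data = trimmed_avocados)`, tested against `model5pd` with `anova()`.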
# add a region:year interaction on top of the main effects
model5pe <- lm(
  average_price ~ type + region + quarter + year + region:year,
  data = trimmed_avocados
)
summary(model5pe)

Call:
lm(formula = average_price ~ type + region + quarter + year + 
    region:year, data = trimmed_avocados)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.03093 -0.14190 -0.00143  0.13797  1.38892 

Coefficients:
                                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)                         1.175e+00  2.396e-02  49.047  < 2e-16 ***
typeorganic                         4.959e-01  3.575e-03 138.719  < 2e-16 ***
regionAtlanta                      -1.582e-01  3.349e-02  -4.724 2.33e-06 ***
regionBaltimoreWashington          -1.699e-01  3.349e-02  -5.074 3.94e-07 ***
regionBoise                        -1.650e-01  3.349e-02  -4.927 8.40e-07 ***
regionBoston                       -6.519e-02  3.349e-02  -1.947 0.051566 .  
regionBuffaloRochester              5.865e-03  3.349e-02   0.175 0.860955    
regionCalifornia                   -2.229e-01  3.349e-02  -6.656 2.89e-11 ***
regionCharlotte                     3.702e-02  3.349e-02   1.106 0.268948    
regionChicago                      -1.347e-01  3.349e-02  -4.023 5.77e-05 ***
regionCincinnatiDayton             -3.364e-01  3.349e-02 -10.047  < 2e-16 ***
regionColumbus                     -2.649e-01  3.349e-02  -7.911 2.70e-15 ***
regionDallasFtWorth                -4.609e-01  3.349e-02 -13.763  < 2e-16 ***
regionDenver                       -3.510e-01  3.349e-02 -10.481  < 2e-16 ***
regionDetroit                      -2.005e-01  3.349e-02  -5.987 2.18e-09 ***
regionGrandRapids                  -1.224e-01  3.349e-02  -3.655 0.000258 ***
regionGreatLakes                   -2.125e-01  3.349e-02  -6.346 2.26e-10 ***
regionHarrisburgScranton           -6.712e-02  3.349e-02  -2.004 0.045053 *  
regionHartfordSpringfield           2.090e-01  3.349e-02   6.243 4.40e-10 ***
regionHouston                      -4.907e-01  3.349e-02 -14.653  < 2e-16 ***
regionIndianapolis                 -1.958e-01  3.349e-02  -5.846 5.11e-09 ***
regionJacksonville                 -3.567e-02  3.349e-02  -1.065 0.286744    
regionLasVegas                     -1.699e-01  3.349e-02  -5.074 3.94e-07 ***
regionLosAngeles                   -3.862e-01  3.349e-02 -11.535  < 2e-16 ***
regionLouisville                   -2.443e-01  3.349e-02  -7.296 3.08e-13 ***
regionMiamiFtLauderdale            -1.552e-01  3.349e-02  -4.635 3.60e-06 ***
regionMidsouth                     -1.874e-01  3.349e-02  -5.597 2.22e-08 ***
regionNashville                    -2.615e-01  3.349e-02  -7.810 6.01e-15 ***
regionNewOrleansMobile             -2.711e-01  3.349e-02  -8.095 6.10e-16 ***
regionNewYork                       1.058e-01  3.349e-02   3.159 0.001588 ** 
regionNortheast                     5.000e-03  3.349e-02   0.149 0.881305    
regionNorthernNewEngland           -6.538e-02  3.349e-02  -1.953 0.050881 .  
regionOrlando                      -3.942e-02  3.349e-02  -1.177 0.239087    
regionPhiladelphia                  1.644e-02  3.349e-02   0.491 0.623415    
regionPhoenixTucson                -3.816e-01  3.349e-02 -11.397  < 2e-16 ***
regionPittsburgh                   -1.315e-01  3.349e-02  -3.928 8.59e-05 ***
regionPlains                       -1.009e-01  3.349e-02  -3.012 0.002597 ** 
regionPortland                     -2.319e-01  3.349e-02  -6.926 4.47e-12 ***
regionRaleighGreensboro            -8.933e-02  3.349e-02  -2.668 0.007646 ** 
regionRichmondNorfolk              -2.642e-01  3.349e-02  -7.891 3.17e-15 ***
regionRoanoke                      -3.116e-01  3.349e-02  -9.306  < 2e-16 ***
regionSacramento                   -8.471e-02  3.349e-02  -2.530 0.011422 *  
regionSanDiego                     -2.645e-01  3.349e-02  -7.899 2.96e-15 ***
regionSanFrancisco                  8.231e-02  3.349e-02   2.458 0.013981 *  
regionSeattle                      -1.165e-01  3.349e-02  -3.480 0.000502 ***
regionSouthCarolina                -8.404e-02  3.349e-02  -2.510 0.012093 *  
regionSouthCentral                 -4.267e-01  3.349e-02 -12.744  < 2e-16 ***
regionSoutheast                    -1.240e-01  3.349e-02  -3.704 0.000213 ***
regionSpokane                      -1.384e-01  3.349e-02  -4.132 3.61e-05 ***
regionStLouis                      -3.538e-02  3.349e-02  -1.057 0.290659    
regionSyracuse                     -9.712e-03  3.349e-02  -0.290 0.771804    
regionTampa                        -1.821e-01  3.349e-02  -5.439 5.44e-08 ***
regionTotalUS                      -2.813e-01  3.349e-02  -8.402  < 2e-16 ***
regionWest                         -3.010e-01  3.349e-02  -8.988  < 2e-16 ***
regionWestTexNewMexico             -2.766e-01  3.357e-02  -8.239  < 2e-16 ***
quarter2                            8.108e-02  5.262e-03  15.407  < 2e-16 ***
quarter3                            2.189e-01  5.262e-03  41.602  < 2e-16 ***
quarter4                            1.620e-01  5.229e-03  30.974  < 2e-16 ***
year2016                           -4.808e-03  3.349e-02  -0.144 0.885838    
year2017                            9.820e-02  3.333e-02   2.947 0.003217 ** 
year2018                            1.257e-02  5.478e-02   0.230 0.818454    
regionAtlanta:year2016             -1.616e-01  4.736e-02  -3.413 0.000643 ***
regionBaltimoreWashington:year2016  2.236e-01  4.736e-02   4.721 2.37e-06 ***
regionBoise:year2016               -2.270e-01  4.736e-02  -4.794 1.65e-06 ***
regionBoston:year2016              -4.260e-02  4.736e-02  -0.899 0.368404    
regionBuffaloRochester:year2016    -5.596e-02  4.736e-02  -1.182 0.237332    
regionCalifornia:year2016           1.885e-02  4.736e-02   0.398 0.690658    
regionCharlotte:year2016           -7.308e-02  4.736e-02  -1.543 0.122814    
regionChicago:year2016              1.481e-01  4.736e-02   3.127 0.001769 ** 
regionCincinnatiDayton:year2016    -1.091e-01  4.736e-02  -2.305 0.021203 *  
regionColumbus:year2016            -8.269e-02  4.736e-02  -1.746 0.080796 .  
regionDallasFtWorth:year2016       -7.692e-02  4.736e-02  -1.624 0.104317    
regionDenver:year2016              -8.981e-02  4.736e-02  -1.896 0.057918 .  
regionDetroit:year2016             -1.611e-01  4.736e-02  -3.401 0.000673 ***
regionGrandRapids:year2016          9.779e-02  4.736e-02   2.065 0.038940 *  
regionGreatLakes:year2016          -4.442e-02  4.736e-02  -0.938 0.348222    
regionHarrisburgScranton:year2016   4.481e-02  4.736e-02   0.946 0.344065    
regionHartfordSpringfield:year2016  1.081e-01  4.736e-02   2.282 0.022488 *  
regionHouston:year2016             -5.135e-02  4.736e-02  -1.084 0.278264    
regionIndianapolis:year2016        -3.663e-02  4.736e-02  -0.774 0.439177    
regionJacksonville:year2016        -1.306e-01  4.736e-02  -2.757 0.005833 ** 
regionLasVegas:year2016            -1.163e-02  4.736e-02  -0.246 0.805929    
regionLosAngeles:year2016          -6.394e-02  4.736e-02  -1.350 0.176953    
regionLouisville:year2016          -7.808e-02  4.736e-02  -1.649 0.099221 .  
regionMiamiFtLauderdale:year2016   -9.894e-02  4.736e-02  -2.089 0.036692 *  
regionMidsouth:year2016             4.327e-03  4.736e-02   0.091 0.927199    
regionNashville:year2016           -1.562e-01  4.736e-02  -3.299 0.000971 ***
regionNewOrleansMobile:year2016    -1.423e-02  4.736e-02  -0.301 0.763794    
regionNewYork:year2016              1.223e-01  4.736e-02   2.583 0.009810 ** 
regionNortheast:year2016            5.673e-02  4.736e-02   1.198 0.230946    
regionNorthernNewEngland:year2016  -7.587e-02  4.736e-02  -1.602 0.109168    
regionOrlando:year2016             -1.237e-01  4.736e-02  -2.613 0.008978 ** 
regionPhiladelphia:year2016         1.244e-01  4.736e-02   2.627 0.008611 ** 
regionPhoenixTucson:year2016        1.064e-01  4.736e-02   2.248 0.024607 *  
regionPittsburgh:year2016          -5.904e-02  4.736e-02  -1.247 0.212525    
regionPlains:year2016              -5.558e-02  4.736e-02  -1.174 0.240571    
regionPortland:year2016            -1.104e-01  4.736e-02  -2.331 0.019767 *  
regionRaleighGreensboro:year2016    3.173e-03  4.736e-02   0.067 0.946579    
regionRichmondNorfolk:year2016     -5.856e-02  4.736e-02  -1.237 0.216273    
regionRoanoke:year2016             -7.481e-02  4.736e-02  -1.580 0.114195    
regionSacramento:year2016           2.189e-01  4.736e-02   4.623 3.80e-06 ***
regionSanDiego:year2016             4.433e-02  4.736e-02   0.936 0.349267    
regionSanFrancisco:year2016         2.650e-01  4.736e-02   5.596 2.23e-08 ***
regionSeattle:year2016             -1.171e-01  4.736e-02  -2.473 0.013404 *  
regionSouthCarolina:year2016       -1.449e-01  4.736e-02  -3.060 0.002217 ** 
regionSouthCentral:year2016        -8.029e-02  4.736e-02  -1.695 0.090012 .  
regionSoutheast:year2016           -1.230e-01  4.736e-02  -2.597 0.009413 ** 
regionSpokane:year2016             -6.202e-02  4.736e-02  -1.310 0.190334    
regionStLouis:year2016             -3.131e-01  4.736e-02  -6.611 3.92e-11 ***
regionSyracuse:year2016            -2.077e-02  4.736e-02  -0.439 0.660973    
regionTampa:year2016               -8.731e-02  4.736e-02  -1.844 0.065251 .  
regionTotalUS:year2016              1.096e-02  4.736e-02   0.231 0.816951    
regionWest:year2016                -5.212e-02  4.736e-02  -1.101 0.271127    
regionWestTexNewMexico:year2016    -1.074e-02  4.741e-02  -0.226 0.820854    
regionAtlanta:year2017             -5.088e-02  4.713e-02  -1.080 0.280337    
regionBaltimoreWashington:year2017  2.115e-01  4.713e-02   4.488 7.25e-06 ***
regionBoise:year2017                1.981e-02  4.713e-02   0.420 0.674245    
regionBoston:year2017               1.069e-01  4.713e-02   2.268 0.023347 *  
regionBuffaloRochester:year2017    -5.596e-02  4.713e-02  -1.187 0.235126    
regionCalifornia:year2017           1.189e-01  4.713e-02   2.523 0.011639 *  
regionCharlotte:year2017            9.496e-02  4.713e-02   2.015 0.043940 *  
regionChicago:year2017              2.117e-01  4.713e-02   4.491 7.12e-06 ***
regionCincinnatiDayton:year2017     1.805e-02  4.713e-02   0.383 0.701811    
regionColumbus:year2017            -5.727e-02  4.713e-02  -1.215 0.224378    
regionDallasFtWorth:year2017        1.633e-05  4.713e-02   0.000 0.999724    
regionDenver:year2017               7.087e-02  4.713e-02   1.504 0.132705    
regionDetroit:year2017             -9.829e-02  4.713e-02  -2.085 0.037040 *  
regionGrandRapids:year2017          1.123e-01  4.713e-02   2.383 0.017189 *  
regionGreatLakes:year2017          -7.076e-04  4.713e-02  -0.015 0.988023    
regionHarrisburgScranton:year2017   2.504e-02  4.713e-02   0.531 0.595237    
regionHartfordSpringfield:year2017  4.143e-02  4.713e-02   0.879 0.379365    
regionHouston:year2017             -4.310e-02  4.713e-02  -0.914 0.360486    
regionIndianapolis:year2017        -1.113e-01  4.713e-02  -2.362 0.018208 *  
regionJacksonville:year2017         6.935e-02  4.713e-02   1.471 0.141188    
regionLasVegas:year2017            -5.010e-02  4.713e-02  -1.063 0.287846    
regionLosAngeles:year2017           1.258e-01  4.713e-02   2.669 0.007623 ** 
regionLouisville:year2017          -3.643e-02  4.713e-02  -0.773 0.439599    
regionMiamiFtLauderdale:year2017    1.549e-01  4.713e-02   3.287 0.001016 ** 
regionMidsouth:year2017             7.014e-02  4.713e-02   1.488 0.136728    
regionNashville:year2017           -1.363e-01  4.713e-02  -2.892 0.003836 ** 
regionNewOrleansMobile:year2017     5.228e-02  4.713e-02   1.109 0.267311    
regionNewYork:year2017              6.631e-02  4.713e-02   1.407 0.159498    
regionNortheast:year2017            5.123e-02  4.713e-02   1.087 0.277109    
regionNorthernNewEngland:year2017   4.724e-03  4.713e-02   0.100 0.920160    
regionOrlando:year2017              8.178e-02  4.713e-02   1.735 0.082730 .  
regionPhiladelphia:year2017         5.299e-02  4.713e-02   1.124 0.260891    
regionPhoenixTucson:year2017        1.635e-02  4.713e-02   0.347 0.728647    
regionPittsburgh:year2017          -1.434e-01  4.713e-02  -3.042 0.002355 ** 
regionPlains:year2017              -2.649e-02  4.713e-02  -0.562 0.574052    
regionPortland:year2017             2.843e-02  4.713e-02   0.603 0.546348    
regionRaleighGreensboro:year2017    2.202e-01  4.713e-02   4.671 3.01e-06 ***
regionRichmondNorfolk:year2017      2.565e-02  4.713e-02   0.544 0.586360    
regionRoanoke:year2017              3.211e-02  4.713e-02   0.681 0.495754    
regionSacramento:year2017           2.209e-01  4.713e-02   4.688 2.78e-06 ***
regionSanDiego:year2017             2.112e-01  4.713e-02   4.481 7.46e-06 ***
regionSanFrancisco:year2017         2.458e-01  4.713e-02   5.215 1.86e-07 ***
regionSeattle:year2017              7.805e-02  4.713e-02   1.656 0.097751 .  
regionSouthCarolina:year2017       -7.398e-02  4.713e-02  -1.570 0.116516    
regionSouthCentral:year2017        -4.827e-02  4.713e-02  -1.024 0.305789    
regionSoutheast:year2017           -1.622e-03  4.713e-02  -0.034 0.972549    
regionSpokane:year2017              1.051e-01  4.713e-02   2.229 0.025817 *  
regionStLouis:year2017             -1.065e-02  4.713e-02  -0.226 0.821183    
regionSyracuse:year2017            -3.868e-02  4.713e-02  -0.821 0.411787    
regionTampa:year2017                1.636e-01  4.713e-02   3.472 0.000519 ***
regionTotalUS:year2017              8.012e-02  4.713e-02   1.700 0.089167 .  
regionWest:year2017                 5.313e-02  4.713e-02   1.127 0.259636    
regionWestTexNewMexico:year2017    -7.563e-02  4.730e-02  -1.599 0.109859    
regionAtlanta:year2018              1.109e-02  7.733e-02   0.143 0.885972    
regionBaltimoreWashington:year2018  1.124e-01  7.733e-02   1.454 0.146096    
regionBoise:year2018                2.217e-01  7.733e-02   2.866 0.004156 ** 
regionBoston:year2018               2.060e-01  7.733e-02   2.664 0.007725 ** 
regionBuffaloRochester:year2018    -2.154e-01  7.733e-02  -2.786 0.005341 ** 
regionCalifornia:year2018           1.983e-01  7.733e-02   2.564 0.010347 *  
regionCharlotte:year2018            9.647e-03  7.733e-02   0.125 0.900720    
regionChicago:year2018              2.605e-01  7.733e-02   3.369 0.000756 ***
regionCincinnatiDayton:year2018     1.764e-01  7.733e-02   2.282 0.022523 *  
regionColumbus:year2018             7.372e-04  7.733e-02   0.010 0.992394    
regionDallasFtWorth:year2018        1.279e-01  7.733e-02   1.655 0.098035 .  
regionDenver:year2018               1.960e-01  7.733e-02   2.534 0.011284 *  
regionDetroit:year2018             -5.744e-02  7.733e-02  -0.743 0.457661    
regionGrandRapids:year2018          1.490e-02  7.733e-02   0.193 0.847176    
regionGreatLakes:year2018           5.500e-02  7.733e-02   0.711 0.476957    
regionHarrisburgScranton:year2018  -3.205e-02  7.733e-02  -0.414 0.678539    
regionHartfordSpringfield:year2018  3.263e-02  7.733e-02   0.422 0.673085    
regionHouston:year2018              9.692e-02  7.733e-02   1.253 0.210099    
regionIndianapolis:year2018        -7.173e-02  7.733e-02  -0.928 0.353643    
regionJacksonville:year2018         5.651e-02  7.733e-02   0.731 0.464972    
regionLasVegas:year2018             1.278e-01  7.733e-02   1.653 0.098372 .  
regionLosAngeles:year2018           3.021e-01  7.733e-02   3.906 9.41e-05 ***
regionLouisville:year2018           7.641e-02  7.733e-02   0.988 0.323126    
regionMiamiFtLauderdale:year2018    6.353e-02  7.733e-02   0.821 0.411391    
regionMidsouth:year2018             1.099e-01  7.733e-02   1.421 0.155277    
regionNashville:year2018            4.821e-02  7.733e-02   0.623 0.533060    
regionNewOrleansMobile:year2018     3.939e-02  7.733e-02   0.509 0.610495    
regionNewYork:year2018              3.298e-02  7.733e-02   0.426 0.669761    
regionNortheast:year2018            3.333e-02  7.733e-02   0.431 0.666443    
regionNorthernNewEngland:year2018   5.080e-02  7.733e-02   0.657 0.511237    
regionOrlando:year2018             -4.183e-02  7.733e-02  -0.541 0.588600    
regionPhiladelphia:year2018        -3.526e-03  7.733e-02  -0.046 0.963637    
regionPhoenixTucson:year2018        1.008e-01  7.733e-02   1.303 0.192425    
 [ reached getOption("max.print") -- omitted 20 rows ]
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2415 on 18029 degrees of freedom
Multiple R-squared:  0.6447,    Adjusted R-squared:  0.6404 
F-statistic: 149.4 on 219 and 18029 DF,  p-value: < 2.2e-16
model5pf <- lm(average_price ~ type + region + quarter + year + quarter:year, data = trimmed_avocados)
summary(model5pf)

Call:
lm(formula = average_price ~ type + region + quarter + year + 
    quarter:year, data = trimmed_avocados)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.96042 -0.13634 -0.00203  0.13537  1.48398 

Coefficients: (3 not defined because of singularities)
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)                1.259208   0.014541  86.600  < 2e-16 ***
typeorganic                0.495932   0.003553 139.577  < 2e-16 ***
regionAtlanta             -0.223077   0.018461 -12.084  < 2e-16 ***
regionBaltimoreWashington -0.026805   0.018461  -1.452 0.146526    
regionBoise               -0.212899   0.018461 -11.532  < 2e-16 ***
regionBoston              -0.030148   0.018461  -1.633 0.102472    
regionBuffaloRochester    -0.044201   0.018461  -2.394 0.016662 *  
regionCalifornia          -0.165710   0.018461  -8.976  < 2e-16 ***
regionCharlotte            0.045000   0.018461   2.438 0.014795 *  
regionChicago             -0.004260   0.018461  -0.231 0.817490    
regionCincinnatiDayton    -0.351834   0.018461 -19.058  < 2e-16 ***
regionColumbus            -0.308254   0.018461 -16.698  < 2e-16 ***
regionDallasFtWorth       -0.475444   0.018461 -25.754  < 2e-16 ***
regionDenver              -0.342456   0.018461 -18.550  < 2e-16 ***
regionDetroit             -0.284941   0.018461 -15.435  < 2e-16 ***
regionGrandRapids         -0.056036   0.018461  -3.035 0.002406 ** 
regionGreatLakes          -0.222485   0.018461 -12.052  < 2e-16 ***
regionHarrisburgScranton  -0.047751   0.018461  -2.587 0.009700 ** 
regionHartfordSpringfield  0.257604   0.018461  13.954  < 2e-16 ***
regionHouston             -0.513107   0.018461 -27.794  < 2e-16 ***
regionIndianapolis        -0.247041   0.018461 -13.382  < 2e-16 ***
regionJacksonville        -0.050089   0.018461  -2.713 0.006669 ** 
regionLasVegas            -0.180118   0.018461  -9.757  < 2e-16 ***
regionLosAngeles          -0.345030   0.018461 -18.690  < 2e-16 ***
regionLouisville          -0.274349   0.018461 -14.861  < 2e-16 ***
regionMiamiFtLauderdale   -0.132544   0.018461  -7.180 7.25e-13 ***
regionMidsouth            -0.156272   0.018461  -8.465  < 2e-16 ***
regionNashville           -0.348935   0.018461 -18.901  < 2e-16 ***
regionNewOrleansMobile    -0.256243   0.018461 -13.880  < 2e-16 ***
regionNewYork              0.166538   0.018461   9.021  < 2e-16 ***
regionNortheast            0.040888   0.018461   2.215 0.026785 *  
regionNorthernNewEngland  -0.083639   0.018461  -4.531 5.92e-06 ***
regionOrlando             -0.054822   0.018461  -2.970 0.002985 ** 
regionPhiladelphia         0.071095   0.018461   3.851 0.000118 ***
regionPhoenixTucson       -0.336598   0.018461 -18.233  < 2e-16 ***
regionPittsburgh          -0.196716   0.018461 -10.656  < 2e-16 ***
regionPlains              -0.124527   0.018461  -6.745 1.57e-11 ***
regionPortland            -0.243314   0.018461 -13.180  < 2e-16 ***
regionRaleighGreensboro   -0.005917   0.018461  -0.321 0.748575    
regionRichmondNorfolk     -0.269704   0.018461 -14.609  < 2e-16 ***
regionRoanoke             -0.313107   0.018461 -16.961  < 2e-16 ***
regionSacramento           0.060533   0.018461   3.279 0.001044 ** 
regionSanDiego            -0.162870   0.018461  -8.822  < 2e-16 ***
regionSanFrancisco         0.243166   0.018461  13.172  < 2e-16 ***
regionSeattle             -0.118462   0.018461  -6.417 1.43e-10 ***
regionSouthCarolina       -0.157751   0.018461  -8.545  < 2e-16 ***
regionSouthCentral        -0.459793   0.018461 -24.906  < 2e-16 ***
regionSoutheast           -0.163018   0.018461  -8.830  < 2e-16 ***
regionSpokane             -0.115444   0.018461  -6.253 4.11e-10 ***
regionStLouis             -0.130414   0.018461  -7.064 1.67e-12 ***
regionSyracuse            -0.040710   0.018461  -2.205 0.027452 *  
regionTampa               -0.152189   0.018461  -8.244  < 2e-16 ***
regionTotalUS             -0.242012   0.018461 -13.109  < 2e-16 ***
regionWest                -0.288817   0.018461 -15.645  < 2e-16 ***
regionWestTexNewMexico    -0.296594   0.018502 -16.030  < 2e-16 ***
quarter2                   0.021204   0.009058   2.341 0.019248 *  
quarter3                   0.082991   0.009058   9.162  < 2e-16 ***
quarter4                  -0.010357   0.009060  -1.143 0.252944    
year2016                  -0.117821   0.009058 -13.007  < 2e-16 ***
year2017                  -0.056574   0.009058  -6.246 4.31e-10 ***
year2018                  -0.004613   0.009245  -0.499 0.617792    
quarter2:year2016         -0.028533   0.012810  -2.227 0.025932 *  
quarter3:year2016          0.095192   0.012810   7.431 1.12e-13 ***
quarter4:year2016          0.256768   0.012811  20.043  < 2e-16 ***
quarter2:year2017          0.208350   0.012812  16.262  < 2e-16 ***
quarter3:year2017          0.312536   0.012810  24.398  < 2e-16 ***
quarter4:year2017          0.261262   0.012696  20.578  < 2e-16 ***
quarter2:year2018                NA         NA      NA       NA    
quarter3:year2018                NA         NA      NA       NA    
quarter4:year2018                NA         NA      NA       NA    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.24 on 18182 degrees of freedom
Multiple R-squared:  0.6461,    Adjusted R-squared:  0.6448 
F-statistic: 502.9 on 66 and 18182 DF,  p-value: < 2.2e-16
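
A side note on the three NA rows above: lm() reports "not defined because of singularities" when a coefficient cannot be estimated from the data. Here that is presumably because 2018 only has observations in quarter 1, so there is nothing to estimate the quarter2/3/4:year2018 terms from. The sketch below reproduces the effect on made-up data; the `toy` data frame and its values are assumptions for illustration only, not the avocado data.

```r
# Toy data only (made-up values): year 2018 has observations in quarter 1 alone
set.seed(1)
toy <- data.frame(
  quarter = factor(c(rep(1:4, times = 20), rep(1, 20))),
  year = factor(c(rep(2017, 80), rep(2018, 20)))
)
toy$price <- rnorm(nrow(toy), mean = 1.4, sd = 0.2)

# Cross-tabulate to spot empty quarter/year cells
cell_counts <- table(toy$quarter, toy$year)
cell_counts

# The interaction terms for the empty cells cannot be estimated and come back NA
toy_fit <- lm(price ~ quarter + year + quarter:year, data = toy)
coef(toy_fit)
```

On the real data, the same check would be something like `trimmed_avocados %>% count(year, quarter)`, looking for missing combinations.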

So it looks like model5pa, with type, region, quarter, year and the type:region interaction, is the best, with a moderate gain in multiple R-squared due to the interaction. However, the p-values of the individual interaction coefficients vary a lot, so we need to test the significance of the interaction as a whole.
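
The test itself is not shown in the output here. One reasonable way to do it, in the same spirit as the earlier anova() check on region, is an F-test comparing the nested models with and without the interaction. The sketch below runs that comparison on simulated data; the `toy` data frame, its region labels and its effect sizes are all made up for illustration.

```r
# Toy data only: a simulated stand-in for trimmed_avocados (values are made up)
set.seed(42)
toy <- expand.grid(
  type = c("conventional", "organic"),
  region = c("A", "B", "C"),
  rep = 1:50
)
# Build a price where the organic premium differs by region (a true interaction)
toy$average_price <- 1 +
  0.5 * (toy$type == "organic") +
  0.3 * (toy$type == "organic" & toy$region == "C") +
  rnorm(nrow(toy), sd = 0.2)

main_effects <- lm(average_price ~ type + region, data = toy)
with_interaction <- lm(average_price ~ type + region + type:region, data = toy)

# F-test of the nested models: a small p-value supports keeping type:region
interaction_test <- anova(main_effects, with_interaction)
interaction_test
```

For the avocado models, the equivalent call would be `anova(model4b, model5pa)`, assuming model4b is still the main-effects model with type, region, quarter and year.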

Neat, it looks like including the interaction is statistically justified, so we can keep it in. Our final model is:

average_price ~ type + region + quarter + year + type:region